Introductory Modern Algebra 
A Historical Approach 


Second Edition 


Saul Stahl 


Department of Mathematics 
University of Kansas 
Lawrence, KS 


WILEY 


Copyright © 2013 by John Wiley & Sons, Inc. All rights reserved. 


Published by John Wiley & Sons, Inc., Hoboken, New Jersey. 
Published simultaneously in Canada. 


No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or 
by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as 
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior 
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to 
the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax 
(978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should 
be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 
07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission. 


Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in 
preparing this book, they make no representation or warranties with respect to the accuracy or 
completeness of the contents of this book and specifically disclaim any implied warranties of 
merchantability or fitness for a particular purpose. No warranty may be created or extended by sales 
representatives or written sales materials. The advice and strategies contained herein may not be 
suitable for your situation. You should consult with a professional where appropriate. Neither the 
publisher nor author shall be liable for any loss of profit or any other commercial damages, including 
but not limited to special, incidental, consequential, or other damages. 


For general information on our other products and services please contact our Customer Care 
Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or 
fax (317) 572-4002. 


Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, 
however, may not be available in electronic formats. For more information about Wiley products, visit 
our web site at www.wiley.com. 


Library of Congress Cataloging-in-Publication Data: 


Stahl, Saul. 
Introductory modern algebra : a historical approach / Saul Stahl, Department of Mathematics, 
University of Kansas. — Second edition. 
p. cm. 

Includes bibliographical references and index. 

ISBN 978-0-470-87616-9 (cloth) 
1. Algebra, Abstract. I. Title. 

QA162.S73 2013 

512'.02—dc23 2013018928 


Printed in the United States of America 


10 9 8 7 6 5 4 3 2 1 


Contents 


Preface 

1 The Early History 
    1.1 The Breakthrough 

2 Complex Numbers 
    2.1 Rational Functions of Complex Numbers 
    2.2 Complex Roots 
    2.3 Solvability by Radicals I 
    2.4 Ruler-and-Compass Constructibility 
    2.5 Orders of Roots of Unity 
    2.6 The Existence of Complex Numbers* 

3 Solutions of Equations 
    3.1 The Cubic Formula 
    3.2 Solvability by Radicals II 
    3.3 Other Types of Solutions* 

4 Modular Arithmetic 
    4.1 Modular Addition, Subtraction, and Multiplication 
    4.2 The Euclidean Algorithm and Modular Inverses 
    4.3 Radicals in Modular Arithmetic* 
    4.4 The Fundamental Theorem of Arithmetic* 

5 The Binomial Theorem and Modular Powers 
    5.1 The Binomial Theorem 
    5.2 Fermat's Theorem and Modular Exponents 
    5.3 The Multinomial Theorem* 
    5.4 The Euler φ-Function* 

6 Polynomials over a Field 
    6.1 Fields and Their Polynomials 
    6.2 The Factorization of Polynomials 
    6.3 The Euclidean Algorithm for Polynomials 
    6.4 Elementary Symmetric Polynomials* 
    6.5 Lagrange's Solution of the Quartic Equation* 

7 Galois Fields 
    7.1 Galois's Construction of His Fields 
    7.2 The Galois Polynomial 
    7.3 The Primitive Element Theorem 
    7.4 On the Variety of Galois Fields* 

8 Permutations 
    8.1 Permuting the Variables of a Function I 
    8.2 Permutations 
    8.3 Permuting the Variables of a Function II 
    8.4 The Parity of a Permutation 

9 Groups 
    9.1 Permutation Groups 
    9.2 Abstract Groups 
    9.3 Isomorphisms of Groups and Orders of Elements 
    9.4 Subgroups and Their Orders 
    9.5 Cyclic Groups and Subgroups 
    9.6 Cayley's Theorem 

10 Quotient Groups and Their Uses 
    10.1 Quotient Groups 
    10.2 Group Homomorphisms 
    10.3 The Rigorous Construction of Fields 
    10.4 Galois Groups and Resolvability of Equations 

11 Topics in Elementary Group Theory 
    11.1 The Direct Product of Groups 
    11.2 More Classifications 

12 Number Theory 
    12.1 Pythagorean Triples 
    12.2 Sums of Two Squares 
    12.3 Quadratic Reciprocity 
    12.4 The Gaussian Integers 
    12.5 Eulerian Integers and Others 
    12.6 What Is the Essence of Primality? 

13 The Arithmetic of Ideals 
    13.1 Preliminaries 
    13.2 Integers of a Quadratic Field 
    13.3 Ideals 
    13.4 Cancelation of Ideals 
    13.5 Norms of Ideals 
    13.6 Prime Ideals and Unique Factorization 
    13.7 Constructing Prime Ideals 

14 Abstract Rings 
    14.1 Rings 
    14.2 Ideals 
    14.3 Domains 
    14.4 Quotients of Rings 

A Excerpts: Al-Khwarizmi 

B Excerpts: Cardano 

C Excerpts: Abel 

D Excerpts: Galois 

E Excerpts: Cayley 

F Mathematical Induction 

G Logic, Predicates, Sets, and Functions 
    G.1 Truth Tables 
    G.2 Modeling Implication 
    G.3 Predicates and Their Negation 
    G.4 Two Applications 
    G.5 Sets 
    G.6 Functions 

Biographies 
Bibliography 
Solutions to Selected Exercises 
Index 
Notation 

* Optional 

Preface 


It is common knowledge amongst mathematicians that much of modern algebra 
has its roots in the issue of solvability of equations by radicals. The purpose of this 
text is to provide the undergraduate mathematics majors and the prospective high school 
mathematics teachers with a one-semester introduction to modern algebra that keeps this 
relationship in view at all times. 

Most modern algebra texts employ an axiomatic strategy that begins with abstract 
groups and ends with fields, ignoring the issue of solvability of equations by radicals. By 
contrast, we follow the paper trail from the Renaissance solution of the cubic equation to 
Galois’s description of his ideas. In the process, all the important concepts are encountered, 
each in a well-motivated manner. 

One year of calculus provides all the information required for the comprehension of 
all the topics in this text, which has many distinguishing features: 

Historical development. Students would prefer to know the real reasons that underlie 
the creation of the mathematical structures they encounter. They also enjoy being placed 
in direct contact with the works of the prime movers of mathematics. This text tries to 
bring them as close to the source as possible. 

Finite groups and fields are rooted in some specific investigations of Lagrange, Gauss, 
Cauchy, Abel, and Galois regarding the solvability of equations by radicals. This text 
makes these connections explicit. Gauss’s proof of the constructibility of the regular 
17-sided polygon is incorporated into the development, and the argument given is merely 
a paraphrase of that which appears in the Disquisitiones. Similarly, the proof of 
Theorem 8.10 is just a reorganization of that given by Abel in his paper on the quintic 
equation. The construction of Galois fields is accomplished in the form of a commentary 
on the opening pages of Galois’s paper On the Theory of Numbers which are quoted verbatim in 
the text. Several important documents are also included as appendices. A considerable 
amount of historical discussion is integrated into the development of the subject matter. 

Cohesive organization. The historical development of the material allows for very little 
flexibility. Each chapter elucidates some of the preceding material and motivates ideas 
that come later. The advantage of this approach is the same as that of good motivation in 
general: it aids comprehension by providing the students with a framework in which to 
fit the various concepts they encounter. A one-semester course can be constructed on the 
basis of Sections 1.1, 2.1-5, 3.1-2, 4.1-2, 5.1-2, 6.1-3, 7.1-3, 8.1-4, 9.1-5, and 10.1. 

[Figure 0.1. The genesis of the theory of finite groups. Boxes: solution of the cubic 
equation (Chapters 1 and 3); solution of the quartic equation (Chapter 6); complex numbers, 
roots of unity, the cyclotomic equation (Chapter 2); the number of values of 
f(x_1, x_2, ..., x_n); Galois fields; the quintic equation is not resolvable (Chapters 8 
and 9); Galois theory of equations.] 

Figure 0.1 illustrates the author’s perception of the evolution of abstract group theory 
(ignoring all the geometric and much of the number-theoretic contributions). The 
number in the right of each box denotes the chapter in which this topic is discussed. 
Solid arrows correspond to connections that are treated in some depth whereas those that 
are displayed by dashed arrows are touched on only informally. 

Chapters 1 to 3 are dedicated to the formalization of the notion of solvability by 
radicals. Gauss’s proof of the constructibility of the regular 17-sided polygon is the 
capstone theorem of this part of the course. Field theory is developed in Chapters 4 to 7. 
The Primitive Element Theorem of Section 7.3 serves as a watershed: it unifies many of 
the important concepts that precede it and motivates the notion of cyclicity that comes 
later. Group theory is developed in Chapters 8 to 10. This begins with an explanation 
of the relevance of permutations to solvability by radicals, goes on to the discussion of 
permutation groups and abstract groups, and concludes with the description of quotient 
groups. Chapter 11 is meant to acquaint the students with some of the standard tools of 
elementary group theory. 

Exercises. Each section is followed by its own set of exercises. These range from the 
routine to the challenging. Each chapter has an additional set of easy review exercises 
added to remind the students of the chapter’s main points. There are over 1,000 of 
these end-of-section and chapter review exercises. The answers to selected odd exercises 
appear at the end of the book. Most chapters are also accompanied by a collection of 
supplementary computer and/or mathematical projects. Some of the latter involve open 
questions. 

Additional pedagogy. Each chapter begins with an introduction and concludes with a 
summary. The purposes of both the introduction and the summary are to provide the 
student with an overview of the chapter, and sometimes to comment on its relationship 
to the previous chapters. The examples are integrated into the exposition and they are 
highlighted by a notation in the margin. Each chapter’s new terms are listed, together 
with the pages on which they are defined, following that chapter’s summary. 

Instructor’s manual. An instructor’s manual is available. It contains the answers to all 
the end-of-section and chapter review exercises. Some suggested homework assignments 
and tests are also included. 



Acknowledgments 


First and foremost I wish to acknowledge the substantial contributions made by Fred 
Galvin who rooted out several inaccuracies in the original development, improved and/or 
corrected many of the proofs, both in the text and the manual, suggested new exercises, 
and used the manuscript in his class. Thanks are also due to Todd Eisworth, Andy Magid 
and Phil Montgomery who also class tested the manuscript and made valuable suggestions 
as well as to my colleague Paul J. McCarthy who was kind enough to lend me both an 
ear and his algebraic expertise. It remains to gratefully acknowledge the efforts of Jessica 
Downey, Steve Quigley, Rosalyn Farkas, and Lisa Van Horn of John Wiley & Sons on 
behalf of this book. 


June 1996 


Preface to the Second Edition 


Surprisingly, it turned out that the historical approach could be used to teach ring theory 
as well. The point of departure is the Theorem of Pythagoras, viewed as a diophantine 
equation. Chapter 12 begins there and goes on to Fermat’s characterization of primes 
that are the sum of two square integers. From there we go on to quadratic reciprocity and 
the Gaussian integers. The question of Gaussian primes is natural and some attention is 
given to variant number systems with radicals $\sqrt{-2}$ or $\sqrt{-3}$. The chapter ends with a 
discussion of Kummer’s decision to redefine the notion of primality. 

Quadratic fields, quadratic integers, and ideals are defined and the arithmetic of ideals is 
explored in Chapter 13. It is shown that the arithmetic of ideals does possess the unique 
factorization property. Finally, Chapter 14 discusses rings and ideals in the abstract 
manner of today. 

The author's understanding of the low level algebraic number theory in Chapter 13 
comes from reading one of Keith Conrad’s many expository monographs. The solutions 
to the selected exercises in Chapters 13 and 14 were derived by Grant Serio and are 
included with his permission. Katie Ballentine, Annika Denkert, and Mark Hunacek 
debugged portions of the manuscript, which was expertly typeset by Lon Mitchell. 


June 2013 


Saul Stahl 


Lawrence, Kansas 


Chapter 1 


THE EARLY HISTORY 


This chapter contains an informal account of the early history of the issue of 
solvability of equations of degrees one, two, and three in a single unknown. The 
formulas that provide the solutions lead in a natural way to the discussion of the origins of 
complex numbers. We also take this opportunity to review some well-known information 
about the quadratic equation. 


1.1. The Breakthrough 


There is a general agreement among historians of mathematics that modern mathematics 
came into being in the mid-sixteenth century when the combined efforts of the Italian 
mathematicians Scipione del Ferro, Niccolò Tartaglia, and Gerolamo Cardano produced 
a formula for the solution of cubic equations. For the first time ever, west European 
mathematicians succeeded in cracking a problem whose solution had eluded the best 
mathematical minds of antiquity. Archimedes, one of the greatest mathematicians, scientists, 
and engineers of all times, had solved some cubic equations in terms of the intersections 
of a suitable parabola and hyperbola. Omar Khayyam, one of the most prominent of the 
Arab mathematicians and poets, also expended much effort on his geometrical solutions 
of special cases of the cubic equation but could not find the general formula. However, 
the significance of this accomplishment of the Renaissance mathematicians is not limited 
to the difficulty of the problem that was solved. We shall try to show how the issues raised 
by this solution eventually led to the creation of modern algebra and the discovery of 
mathematical landscapes that were undreamt of, even by such imaginative investigators 
as Archimedes and Khayyam. 

Introductory Modern Algebra, Second Edition. 
By Saul Stahl. Copyright © 2013 John Wiley & Sons, Inc. 

The interest in algebraic equations goes back to the beginnings of written history. The 
Rhind Mathematical Papyrus, found in Egypt circa 1856, is a copy of a list of mathematical 
problems compiled some time during the second half of the nineteenth century BCE, or 
possibly even earlier. The twenty-fourth of these problems reads: “A quantity and its 1/7 
added become 19. What is the quantity?” In other words, what is the solution to the 
equation 
\[ x + \frac{x}{7} = 19? \]
The method employed by the scribe has come to be known as the method of false position. 
He replaces the unknown by 7 and observes that 
\[ 7 + \frac{7}{7} = 8. \]
From this he concludes that the correct answer is obtained upon multiplying the first 
guess of 7 by 19/8: 
\[ x = 7 \cdot \frac{19}{8} = \frac{133}{8}. \]
Interestingly enough, the scribe does double check his solution by substituting it into the 
original problem and verifying that 
\[ \frac{133}{8} + \frac{133/8}{7} = 19. \]


We will not discuss the merits and limitations of the method of false position except to 
note that the idea of obtaining a correct solution to an equation by starting out with 
a possibly false guess and then modifying that guess has been refined into powerful 
techniques for finding numerical solutions, one of which will be described in Section 3.3. 
We do, however, wish to point out that the general first-degree equation is today defined 
as 
\[ ax + b = 0, \quad a \neq 0, \]
and that the rules of algebra yield 
\[ x = -\frac{b}{a} \]
as its unique solution. 
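
The scribe's procedure for Problem 24 can be sketched in a few lines of Python. This is 
only an illustration: the names false_position and quantity_plus_seventh are assumptions 
introduced here, and the rescaling step is valid only when the left side of the equation 
is proportional to the unknown, as it is for x + x/7. 

```python
from fractions import Fraction

def false_position(lhs, guess, target):
    """Rescale a trial value so that lhs(x) == target.

    Assumes lhs is proportional to x, i.e. lhs(k * x) == k * lhs(x),
    which holds for Problem 24. The names are illustrative only.
    """
    trial = lhs(guess)                    # value produced by the (possibly false) guess
    return guess * Fraction(target) / trial

def quantity_plus_seventh(x):
    # Left side of x + x/7 = 19, kept exact with Fraction arithmetic.
    return x + x / 7

x = false_position(quantity_plus_seventh, Fraction(7), 19)
print(x)                                  # 133/8
print(quantity_plus_seventh(x))           # 19
```

Exact rational arithmetic is used so that the final check reproduces the scribe's 
verification without rounding error. 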

The Mesopotamian mathematicians of that time could solve much more intricate 
equations, and had in fact already developed techniques for solving what we nowadays call 
quadratic equations. These techniques employed the geometrical method of “completing 
the square.” The Greeks, Indians, and Arabs all were aware of this method, having 
either derived it independently or perhaps learnt it from their predecessors and/or 
neighbors. In the ninth century the Persian mathematician al-Khwarizmi wrote the book 
Hisab al-jabr w‘al-muqa-balah in which he carefully explained a compendium of algebraic 
techniques learnt from several past civilizations. The clarity of his exposition won both 
him and his book immortality in that the portion al-jabr of the title evolved into the word 
algebra, and the author’s name is the source of the word algorithm. An excerpt from this 
book expounding the solution to the quadratic equation 
\[ x^2 + 10x = 39 \]


appears in Appendix A. The modern solution of the quadratic also relies on the completion 
of the square. The general quadratic equation has the form 
\[ ax^2 + bx + c = 0, \quad a \neq 0, \tag{1.1} \]
and its solutions are found by first factoring out the coefficient $a$ and then completing 
the rest to a perfect square. Thus, we first divide Equation 1.1 through by $a$ to obtain 
the equation 
\[ x^2 + \frac{b}{a}x + \frac{c}{a} = 0. \tag{1.2} \]
The left side of Equation 1.2 is then transformed to a near perfect square: 
\[
\begin{aligned}
x^2 + \frac{b}{a}x + \frac{c}{a}
  &= \left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a^2} + \frac{c}{a} \\
  &= \left(x + \frac{b}{2a}\right)^2 - \frac{b^2 - 4ac}{4a^2}.
\end{aligned}
\]
The original quadratic equation has thus been transformed to 
\[ \left(x + \frac{b}{2a}\right)^2 - \frac{b^2 - 4ac}{4a^2} = 0 \]
or 
\[ \left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}
   \quad\text{or}\quad
   x + \frac{b}{2a} = \frac{\pm\sqrt{b^2 - 4ac}}{2a}. \]
Hence the general quadratic equation, Equation 1.1, has the two solutions 
\[ x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}
   \quad\text{and}\quad
   x_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}. \]


It is clear that if $a$, $b$, and $c$ are real numbers, then these two solutions are real and 
distinct if $b^2 - 4ac > 0$, they are real and identical if $b^2 - 4ac = 0$, and they are 
imaginary and distinct if $b^2 - 4ac < 0$. Another important fact to bear in mind 
(Exercises 1.1.5 and 1.1.6) is that 
\[ x_1 + x_2 = -\frac{b}{a} \quad\text{and}\quad x_1 x_2 = \frac{c}{a}, \]
from which it follows that it is easy to construct a quadratic equation whose roots are 
prespecified. As we will have several occasions to refer to these identities later, they are 
stated as a proposition whose proof is relegated to Exercise 1.1.14. 


Proposition 1.3 For any two numbers $r$ and $s$ the quadratic equation 
\[ x^2 - (r + s)x + rs = 0 \]
has $r$ and $s$ as its roots. 
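
The quadratic formula and Proposition 1.3 can be tried out numerically. The following is 
a minimal sketch; the function names quadratic_roots and quadratic_with_roots are 
assumptions made for this illustration, and Python's cmath module is used so that a 
negative discriminant is handled as well. 

```python
import cmath

def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 (a != 0) as complex numbers."""
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    x1 = (-b + root_of_discriminant) / (2 * a)
    x2 = (-b - root_of_discriminant) / (2 * a)
    return x1, x2

def quadratic_with_roots(r, s):
    """Coefficients (1, -(r + s), r*s) of the equation in Proposition 1.3."""
    return 1, -(r + s), r * s

# The equation x^2 + 10x = 39 from the text, rewritten as x^2 + 10x - 39 = 0.
x1, x2 = quadratic_roots(1, 10, -39)
print(x1, x2)              # the roots 3 and -13, printed as complex numbers
print(x1 + x2, x1 * x2)    # compare with -b/a = -10 and c/a = -39

# Proposition 1.3: build a quadratic with prescribed roots 2 and -5.
print(quadratic_with_roots(2, -5))   # (1, 3, -10)
```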


It is reasonable at this point to raise the ante and ask for a formula that will yield the 
solution of the general cubic equation 
\[ ax^3 + bx^2 + cx + d = 0. \tag{1.4} \]
There are indications that the Mesopotamians already tried to systematize the search for 
solutions of cubic equations, and we know for a fact that the Greeks attempted the same. 
As was mentioned above, the final breakthrough did not occur until the middle of the 
sixteenth century when it was shown that a solution of the equation 
\[ x^3 + px + q = 0 \]
is given by the expression 
\[ x = \sqrt[3]{-q/2 + \sqrt{q^2/4 + p^3/27}} - \sqrt[3]{q/2 + \sqrt{q^2/4 + p^3/27}}. \tag{1.5} \]



As we shall see later, very little additional work is required to pass from this formula on 
to a formula for the general cubic equation (Equation 1.4), and so Formula 1.5 can be 
considered as the crucial step, even though it does not yield the solution to the most 
general cubic equation. 

In analogy with the ancient solutions of the quadratic, this solution was obtained by 
a geometrical process of completing the cube. Excerpts from Cardano’s description of 
the solution are contained in Appendix B. A modern derivation of this formula appears 
in Chapter 3, and we restrict ourselves here to the examination of some instructive 
applications of Formula 1.5. Surprisingly, this formula raises some very interesting 
questions. 
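
Formula 1.5 is easy to evaluate numerically when every radical in it is a real number. 
The sketch below is an illustration under that assumption only: the names 
real_cube_root and depressed_cubic_root are introduced here, and the function refuses 
the case $q^2/4 + p^3/27 < 0$, in which the formula calls for square roots of negative 
numbers. 

```python
import math

def real_cube_root(t):
    """Real cube root of a real number, valid for negative t as well."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)

def depressed_cubic_root(p, q):
    """One real root of x**3 + p*x + q = 0 computed from Formula 1.5.

    Handles only q**2/4 + p**3/27 >= 0, so that all radicals are real.
    """
    inner = q * q / 4 + p ** 3 / 27
    if inner < 0:
        raise ValueError("Formula 1.5 needs the square root of a negative number here")
    s = math.sqrt(inner)
    return real_cube_root(-q / 2 + s) - real_cube_root(q / 2 + s)

print(depressed_cubic_root(0, -1))    # root of x^3 - 1 = 0
print(depressed_cubic_root(6, -20))   # root of x^3 + 6x - 20 = 0, close to 2
```

Floating-point evaluation only approximates the radicals, so an answer such as 2 is 
reproduced up to rounding error rather than exactly. 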

Consider the cubic equation $x^3 - 1 = 0$. Here $p = 0$ and $q = -1$, and so Formula 1.5 
yields 
\[ x = \sqrt[3]{1/2 + \sqrt{1/4 + 0}} - \sqrt[3]{-1/2 + \sqrt{1/4 + 0}}
     = \sqrt[3]{1/2 + 1/2} - \sqrt[3]{-1/2 + 1/2} = 1, \]
which is as it should be. However, for the equation $x^3 + 6x - 20 = 0$, which Cardano 
uses as an illustration in his Ars Magna, the same formula yields the solution 
\[ x = \sqrt[3]{10 + \sqrt{100 + 8}} - \sqrt[3]{-10 + \sqrt{100 + 8}}
     = \sqrt[3]{\sqrt{108} + 10} - \sqrt[3]{\sqrt{108} - 10}. \]
It can be easily verified with the aid of a calculator that the above solution agrees with 
2 to at least eight decimal places, and the mathematical verification that the agreement 
is absolute is left to Exercise 1.1.1. Our purpose in presenting this example was to 
draw attention to the possibility that Formula 1.5 may present a correct solution in an 
unnecessarily complicated form. This obfuscation becomes much more disturbing in the 
case of the equation $x^3 - 15x - 4 = 0$, treated by Rafael Bombelli in his Algebra (1572). 
Formula 1.5 yields the solution 
\[ x = \sqrt[3]{2 + \sqrt{-121}} - \sqrt[3]{-2 + \sqrt{-121}}. \tag{1.6} \]


However, it is easily verified by inspection that $x = 4$ is also a solution of this cubic, 
and, since 
\[ x^3 - 15x - 4 = (x - 4)(x^2 + 4x + 1), \]
two more solutions of the original equation are obtained by solving the quadratic 
\[ x^2 + 4x + 1 = 0. \]
As the solutions of this quadratic are $-2 \pm \sqrt{3}$, we are faced with the question of 
which of the three numbers 4 or $-2 \pm \sqrt{3}$ is disguised as Expression 1.6. Moreover, 
this complicated expression involves square roots of negative numbers, in other words, 
imaginary quantities, whereas 4 and $-2 \pm \sqrt{3}$ are all real numbers. This apparent 
paradox was resolved by Bombelli who simplified Expression 1.6 by setting 
\[ \sqrt[3]{2 + \sqrt{-121}} = a + b\sqrt{-1}, \]
cubing both sides and deriving $a = 2$ and $b = 1$ from the resulting simultaneous equations. 
Rather than exhibit the details of his solution we simply point out that indeed 
\[
\begin{aligned}
(2 + \sqrt{-1})^3 &= 2^3 + 3 \cdot 2^2 \sqrt{-1} + 3 \cdot 2 \cdot (\sqrt{-1})^2 + (\sqrt{-1})^3 \\
                  &= 8 + 12\sqrt{-1} - 6 - \sqrt{-1} \\
                  &= 2 + 11\sqrt{-1} = 2 + \sqrt{-121}
\end{aligned}
\]
and similarly 
\[ (-2 + \sqrt{-1})^3 = -2 + \sqrt{-121}. \]
Consequently, 
\[ \sqrt[3]{2 + \sqrt{-121}} - \sqrt[3]{-2 + \sqrt{-121}} = 2 + \sqrt{-1} - (-2 + \sqrt{-1}) = 4. \]
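
The two cubes used in this simplification can be checked with exact integer arithmetic. 
In the sketch below a number $a + b\sqrt{-1}$ with integer $a$ and $b$ is stored as the 
pair (a, b); this representation and the names gaussian_mul and gaussian_cube are 
assumptions made for the illustration, not notation from the text. 

```python
def gaussian_mul(z, w):
    """Multiply a + b*sqrt(-1) by c + d*sqrt(-1), using (sqrt(-1))**2 = -1."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def gaussian_cube(z):
    """Cube of a + b*sqrt(-1), computed by two multiplications."""
    return gaussian_mul(gaussian_mul(z, z), z)

# sqrt(-121) = 11*sqrt(-1), so 2 + sqrt(-121) is the pair (2, 11).
print(gaussian_cube((2, 1)))     # (2, 11)
print(gaussian_cube((-2, 1)))    # (-2, 11)

# Difference of the two cube roots chosen in the text: (2, 1) - (-2, 1).
difference = (2 - (-2), 1 - 1)
print(difference)                # (4, 0), that is, the real number 4
```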


Thus, users of the cubic formula ignore the so-called imaginary numbers at their peril. 
Such prejudices come at the cost of losing some real solutions to real equations. This is 
further borne out by the innocent-looking equation $x^3 - 3x = 0$. Formula 1.5 yields the 
solution 
\[ x = \sqrt[3]{\sqrt{-1}} - \sqrt[3]{\sqrt{-1}}, \]
and even if one is very skeptical about the existence of imaginary quantities it is very 
tempting to believe in them just long enough for the above radicals to cancel out and to 
yield the root $x = 0$, which we know to be correct. 

The solution to the cubic equation is the context within which imaginary numbers 
were first discussed by mathematicians. Cardano toyed with them and then rejected 
them as useless. Bombelli gave them more credence, but it wasn’t until about 200 years 
later that the work of Leonhard Euler, Pierre-Simon de Laplace, and later that of Carl 
Friedrich Gauss, Augustin-Louis Cauchy, and Niels Abel turned the complex number 
system, consisting of both the real and imaginary numbers, into an indispensable tool 
for mathematical researchers. 

The Ferro-Tartaglia-Cardano Formula 1.5 suffers from a serious deficiency. This formula 
yields at most one solution for any cubic equation, even when such an equation 
is known to have three distinct real roots, as is the case for $x^3 - x = 0$ whose roots are 
0 and $\pm 1$. In view of the fact that the quadratic formula of Equation 1.1 does succeed 
in incorporating all the solutions into one expression it would not seem unreasonable to 
expect the same of the cubic counterpart. As we shall see in the next chapter, the complex 
numbers will enable us to find just such an expression. 


Exercises 1.1 


1. Prove that $\sqrt[3]{\sqrt{108} + 10} - \sqrt[3]{\sqrt{108} - 10} = 2$. 

2. Prove that $\sqrt{28 - 10\sqrt{3}} - \sqrt{7 - 4\sqrt{3}} = 3$. 

3. Solve the equation $3x^2 - 2x - 2 = 0$. 

4. Solve the equation $x^4 - 3x^2 + 2 = 0$. 

If $r$ and $s$ are the roots of the quadratic equation $ax^2 + bx + c = 0$, prove the 
identities in Exercises 1.1.5 to 1.1.7. 

5. $r + s = -b/a$ 

6. $rs = c/a$ 

7. $r^2 + s^2 = (b^2 - 2ac)/a^2$ 

If $r$ and $s$ are the roots of the quadratic equation $ax^2 + bx + c = 0$, rewrite the 
expressions in Exercises 1.1.8 to 1.1.13 in terms of $a$, $b$, and $c$. Wherever necessary, 
you may assume that the denominators are not zero. 

8. $1/r + 1/s$ 

9. [expression not recoverable from the source] 

10. $r^2 s + r s^2$ 

11. $(r - s)^2$ 

12. [expression not recoverable from the source] 

13. $1/(r^2 s) + 1/(r s^2)$ 

14. Prove Proposition 1.3. 

15. If $r$ and $s$ are the roots of the equation $x^2 + px + q = 0$, what is the quadratic 
equation whose roots are $r + s$ and $rs$? 

16. If $r, s \neq 0$ are the roots of the equation $x^2 + px + q = 0$, what is the quadratic 
equation whose roots are $1/r$ and $1/s$? 

17. For what real values of $a$ are the roots of the equation $x^2 + ax + a = 0$ real? 

18. For what values of $m$ will the equation $x^2 - 2x(1 + 3m) + (3 + 2m) = 0$ have equal 
roots? 


Chapter Summary 


This introductory chapter was used to briefly review the solutions of the first- and second- 
degree equations in a single unknown. The history of the solution of the cubic equation 
was also discussed and the relationship of this formula to the complex number system 
was examined. 


Chapter Review Exercises 


Mark the following true or false. 
1. Every real number is the solution of some equation. 
2. Every pair of real numbers is the solution set of some quadratic equation. 


3. Every equation has at least one solution. 


New Terms 


cubic equation 
first-degree equation 
method of false position 
quadratic equation 


Chapter 2 


COMPLEX NUMBERS 


Throughout history, the introduction of new numbers has been greeted with 
considerable resistance on the part of mathematicians. Legend has it that the 
discoverer of irrational numbers was rewarded by being drowned by his fellow Greeks. 
Be that as it may, the fact is that these numbers have been tagged with the pejorative 
label of irrational, a word which, when used in nonmathematical contexts, has definite 
derogatory connotations. The same, of course, applies to the negative numbers. The 
imaginary numbers have been cursed with what is arguably the worst nomenclature in 
mathematics. Given the considerable difficulties that the average students face in learning 
the rigorous discipline of mathematics, can they be blamed for balking at having to 
contend with quantities that mathematicians themselves admit are imaginary? 

The best way to overcome people’s resistance to a new concept is to convince them of 
its utility. Accordingly, it will be shown that the widening of our field of operations to 
include the complex numbers greatly enhances the power of the Ferro-Tartaglia-Cardano 
cubic formula. Next, the complex numbers will be used to solve some ruler-and-compass 
construction problems of plane geometry. Only in this chapter’s last section will the issue 
of the existence of the complex numbers be addressed. 


2.1 Rational Functions of Complex Numbers 


Just as was done by the mathematicians of the eighteenth and nineteenth centuries, we 
assume here the existence of a number $i$ which has the property that $i^2 = -1$. 

The rigorous proof of $i$'s existence is deferred to Section 2.6. In the meantime, the 
number $i$ is to be treated just like a variable, with the sole additional stipulation that 
whenever $i^2$ occurs within an algebraic expression, it can be replaced by $-1$. A complex 
number is an expression of the form $a + bi$ where $a$ and $b$ are any real numbers. When 
$b \neq 0$ such a number is called an imaginary number and when $b = 0$ it is said to be real. 

These complex numbers can be added and subtracted as polynomials. Thus, 
\[
\begin{aligned}
(5 - 3i) + (-2 + 5i) &= 5 - 3i - 2 + 5i = 3 + 2i, \\
(5 - 3i) - (-2 + 5i) &= 5 - 3i + 2 - 5i = 7 - 8i.
\end{aligned}
\]


The multiplication of complex numbers also resembles that of polynomials, except that 
each occurrence of $i^2$ is replaced by $-1$. Thus, 
\[
\begin{aligned}
(5 - 3i)(-2 + 5i) &= -10 + 25i + 6i - 15i^2 \\
                  &= -10 + 31i - 15(-1) \\
                  &= -10 + 31i + 15 = 5 + 31i.
\end{aligned}
\]
The division of complex numbers mimics the well-known process of rationalizing 
denominators. Thus, 
\[
\frac{5 - 3i}{-2 + 5i}
  = \frac{5 - 3i}{-2 + 5i} \cdot \frac{-2 - 5i}{-2 - 5i}
  = \frac{-10 - 25i + 6i + 15i^2}{(-2)^2 - (5i)^2}
  = \frac{-25 - 19i}{29}
  = -\frac{25}{29} - \frac{19}{29}i.
\]
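
The four operations just described can be sketched with exact arithmetic. Here a complex 
number $a + bi$ with integer parts is stored as the pair (a, b); this representation and 
the names c_add, c_sub, c_mul, and c_div are assumptions made for the illustration. 

```python
from fractions import Fraction

def c_add(z, w):
    return (z[0] + w[0], z[1] + w[1])

def c_sub(z, w):
    return (z[0] - w[0], z[1] - w[1])

def c_mul(z, w):
    """Multiply as polynomials in i, then replace i**2 by -1."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def c_div(z, w):
    """Divide by multiplying numerator and denominator by the conjugate of w."""
    c, d = w
    norm = c * c + d * d              # (c + di)(c - di) = c**2 + d**2
    if norm == 0:
        raise ZeroDivisionError("division by the complex number 0")
    re, im = c_mul(z, (c, -d))
    return (Fraction(re, norm), Fraction(im, norm))

z, w = (5, -3), (-2, 5)
print(c_add(z, w))    # (3, 2)
print(c_sub(z, w))    # (7, -8)
print(c_mul(z, w))    # (5, 31)
print(c_div(z, w))    # (Fraction(-25, 29), Fraction(-19, 29))
```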


Surprisingly, all of these arithmetical operations can be given very interesting visual, 
or geometric, interpretations. To accomplish this, we represent each complex number 
$a + bi$ by the point $(a, b)$ of the Cartesian plane. The point $(a, b)$ is called the Cartesian 
representation of the complex number $a + bi$. Given two complex numbers $a + bi$ and 
$c + di$, let their Cartesian representations be $P = (a, b)$ and $Q = (c, d)$ (Figure 2.1). 
Their sum 
\[ (a + bi) + (c + di) = (a + c) + (b + d)i \]
is represented by the point $R = (a + c, b + d)$. However, 
\[
\begin{aligned}
\text{slope of } PR &= \frac{(b + d) - b}{(a + c) - a} = \frac{d}{c} = \text{slope of } OQ, \\
\text{slope of } QR &= \frac{(b + d) - d}{(a + c) - c} = \frac{b}{a} = \text{slope of } OP.
\end{aligned}
\]
Consequently, $PR \parallel OQ$ and $QR \parallel OP$ and so $OPRQ$ is a parallelogram. Thus we see 
that the addition of complex numbers resembles that of vectors. These considerations are 
summarized as follows. 

[Figure 2.1. Complex addition, showing the point R(a + c, b + d).] 

Proposition 2.1 Let $O$ denote the origin of the Cartesian plane and let $P$ and $Q$ be the 
Cartesian representations of the complex numbers $a + bi$ and $c + di$, respectively. If the 
sum of the two complex numbers is represented by the point $R$, then the quadrilateral 
$OPRQ$ is a parallelogram. 
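
The parallelism claimed in Proposition 2.1 can be tested for particular points. The 
sketch below is an illustration only: the names cross and opposite_sides_parallel are 
assumptions, and a cross product is used in place of the slopes so that the check also 
covers vertical sides, for which the slope quotients are undefined. 

```python
def cross(u, v):
    """z-component of the cross product; it is zero exactly when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]

def opposite_sides_parallel(P, Q):
    """Check that PR || OQ and QR || OP, where R represents the sum of P and Q."""
    R = (P[0] + Q[0], P[1] + Q[1])
    side_OP, side_OQ = P, Q                     # O is the origin (0, 0)
    side_PR = (R[0] - P[0], R[1] - P[1])
    side_QR = (R[0] - Q[0], R[1] - Q[1])
    return cross(side_PR, side_OQ) == 0 and cross(side_QR, side_OP) == 0

print(opposite_sides_parallel((5, -3), (-2, 5)))   # True
print(opposite_sides_parallel((0, 3), (4, 1)))     # True, although OP is vertical
```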


To give the multiplication of complex numbers a visual interpretation, it is convenient 
to begin by establishing some conventions. In the sequel, the general complex number 
$a + bi$ will frequently be abbreviated as $z$. If either $a$ or $b$ is 0, it is omitted from 
$a + bi$. Thus, $3 + 0i = 3$ and $0 - 5i = -5i$. 

Let $P = (a, b)$ be the Cartesian representation of the complex number $z = a + bi$ 
(Figure 2.2). The modulus of $z$, denoted by $|z|$, is the length of the line segment $OP$. 
In other words $|a + bi| = \sqrt{a^2 + b^2}$. Thus, for example, 
\[
\begin{aligned}
|2 + 3i| &= \sqrt{2^2 + 3^2} = \sqrt{13}, \\
|3 - 4i| &= \sqrt{3^2 + (-4)^2} = 5, \\
|-2i| &= \sqrt{0^2 + (-2)^2} = 2.
\end{aligned}
\]


It is clear that the modulus of the real number $a + 0i$ is just its absolute value. Thus 
the modulus should be regarded as the extension of the notion of absolute value to the 
complex numbers. The argument of $z = a + bi$, denoted $\arg(z)$, is the counterclockwise 
angle from the positive $x$-axis to the ray $OP$ where $P$ is the Cartesian representation 
of $z$. As can be seen in Figure 2.3, the arguments of $3$, $1 + i$, $3i$, $-2$, $-2 - 2i$, and 
$3 - 3i$ are $0$, $\pi/4$, $\pi/2$, $\pi$, $5\pi/4$, and $7\pi/4$, respectively. For our purposes here it is 
convenient to identify angles whose measures differ by the full angle of $2\pi$. Thus it will 
be convenient sometimes to regard 1 as having argument $2\pi$ or $4\pi$ rather than 0. The 
reasons for this will become clear after we have discussed the geometrical interpretation 
of the multiplication of complex numbers. 

[Figure 2.2. The argument and the modulus, showing the point P(a, b) = a + bi = z.] 

The argument $\theta$ of the general complex number $z = a + bi$ is easily computed (Figure 2.2) from the relation $\tan\theta = b/a$, but the quadrant in which $z$ lies must be taken into account. Thus

$$\arg(1 + i) = \arctan\frac{1}{1} = \frac{\pi}{4},$$

whereas

$$\arg(-2 - 2i) = \pi + \arctan\frac{-2}{-2} = \frac{5\pi}{4}.$$
Observe that if the complex number $z = a + bi$ has argument $\theta$, then, by Figure 2.2,

$$z = a + bi = |z|\left(\frac{a}{|z|} + \frac{b}{|z|}\,i\right) = |z|(\cos\theta + i\sin\theta).$$

We refer to $|z|(\cos\theta + i\sin\theta)$ as the polar form of $z$. For example, the complex numbers $1+i$, $5$, $i$, and $-2i$ have polar forms $\sqrt{2}(\cos\pi/4 + i\sin\pi/4)$, $5(\cos 0 + i\sin 0)$, $\cos\pi/2 + i\sin\pi/2$, and $2(\cos 3\pi/2 + i\sin 3\pi/2)$, respectively. On the other hand, the number $3 + 4i$ has polar form $5(\cos\alpha + i\sin\alpha)$ where $\alpha = \arctan 4/3 \approx 53.13^\circ$.
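The quadrant bookkeeping described above can be delegated to a two-argument arctangent. The following is a minimal sketch in Python, assuming the standard `cmath` and `math` modules; the helper name `modulus_and_argument` is illustrative and not part of the text. It returns the modulus together with the argument normalized to the interval $[0, 2\pi)$ used in the examples above.

```python
import cmath
import math

def modulus_and_argument(z):
    """Return (|z|, arg z) with the argument normalized to [0, 2*pi)."""
    r, theta = cmath.polar(z)      # cmath.polar reports theta in (-pi, pi]
    if theta < 0:
        theta += 2 * math.pi       # identify angles that differ by 2*pi
    return r, theta

for z in (1 + 1j, 5, 1j, -2j, -2 - 2j, 3 + 4j):
    r, theta = modulus_and_argument(z)
    print(z, round(r, 4), round(math.degrees(theta), 2))
```

Floating-point values are approximations, so the printed angles agree with the exact ones only up to rounding.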
Just as the addition of complex numbers has a geometrical interpretation in terms of Cartesian coordinates, so their multiplication can be easily visualized in terms of their polar forms. Let the complex numbers $z$ and $w$ be given in terms of their polar forms

$$z = |z|(\cos\theta + i\sin\theta) \quad\text{and}\quad w = |w|(\cos\varphi + i\sin\varphi).$$




Figure 2.3. Some complex numbers 


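The trigonometric formulas for the functions of the sums of two angles yield

$$\begin{aligned}
zw &= |z||w|(\cos\theta + i\sin\theta)(\cos\varphi + i\sin\varphi) \\
   &= |z||w|[(\cos\theta\cos\varphi - \sin\theta\sin\varphi) + i(\cos\theta\sin\varphi + \sin\theta\cos\varphi)] \\
   &= |z||w|[\cos(\theta + \varphi) + i\sin(\theta + \varphi)].
\end{aligned}$$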


Thus, the product $zw$ has polar form $|z||w|[\cos(\theta + \varphi) + i\sin(\theta + \varphi)]$. Since it follows from Figure 2.2 that every complex number is completely determined by its argument and modulus, we conclude that $\arg(zw) = \theta + \varphi$ and $|zw| = |z||w|$. Hence, we have proved the following theorem.

Theorem 2.2 Let $z$ and $w$ be any two complex numbers. Then $\arg(zw) = \arg(z) + \arg(w)$ and $|zw| = |z||w|$.


If $z = i$ and $w = 1 + i$, then $zw = i(1 + i) = -1 + i$ and so

$$\arg(z) + \arg(w) = \frac{\pi}{2} + \frac{\pi}{4} = \frac{3\pi}{4} = \arg(zw)$$

and $|z||w| = 1\cdot\sqrt{2} = \sqrt{2} = |zw|$.




Angles whose measures differ by integer multiples of $2\pi$ are considered to be identical. Thus, the number $i$ has all the following as its arguments:

$$\ldots,\ \frac{-7\pi}{2},\ \frac{-3\pi}{2},\ \frac{\pi}{2},\ \frac{5\pi}{2},\ \frac{9\pi}{2},\ \ldots$$

This is necessitated by such observations as the fact that

$$\pi/2 = \arg(i) = \arg[(-1)(-i)] = \arg(-1) + \arg(-i) = \pi + 3\pi/2 = 5\pi/2.$$


This fact, which appears to be a nuisance at this point, will in fact turn out to be very useful in the next section. Some more light will be shed on this in Section 9.4. Theorem 2.2 is commonly referred to as the argument principle. It has many interesting and useful consequences. For example, it clearly implies that $\arg(z^2) = 2\arg(z)$ and $|z^2| = |z|^2$. Similarly, $\arg(z^3) = 3\arg(z)$ and $|z^3| = |z|^3$. In fact, a simple induction procedure yields the following observation.

Corollary 2.3 If $z$ is any nonzero complex number and $k$ is any positive integer, then

$$\arg(z^k) = k\arg(z) \quad\text{and}\quad |z^k| = |z|^k.$$


This observation can be put to good use in computing large powers of complex numbers. Consider the problem of computing $(1 + i)^{100}$. By Corollary 2.3,

$$\arg[(1 + i)^{100}] = 100\arg(1 + i) = 100\cdot(\pi/4) = 25\pi = \pi$$

and $|(1 + i)^{100}| = |1 + i|^{100} = (\sqrt{2})^{100} = 2^{50}$. Hence, $(1 + i)^{100} = 2^{50}(\cos\pi + i\sin\pi) = -2^{50}$.
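The value of $(1+i)^{100}$ obtained from Corollary 2.3 can be cross-checked with exact integer arithmetic. The sketch below is only an illustration: it represents $a + bi$ with integer $a$, $b$ as a pair `(a, b)`, and the helper names `gauss_mul` and `gauss_pow` are assumptions introduced here, not notation from the text.

```python
def gauss_mul(p, q):
    """Multiply two complex numbers with integer parts, given as (real, imag) pairs."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

def gauss_pow(p, k):
    """Raise p to the nonnegative integer power k by repeated squaring."""
    result = (1, 0)          # the number 1
    base = p
    while k > 0:
        if k & 1:            # current binary digit of k is 1
            result = gauss_mul(result, base)
        base = gauss_mul(base, base)
        k >>= 1
    return result

power = gauss_pow((1, 1), 100)
print(power == (-2**50, 0))
```

Because only integers are involved, there is no rounding error; the comparison tests exact equality with $-2^{50}$.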

Corollary 2.3 also holds for nonpositive exponents if we define $z^0 = 1$ for all $z$ and $z^{-k} = (1/z)^k$ for $z \ne 0$ and $k = 1, 2, 3, \ldots$.

The proof of this fact is relegated to Exercise 2.1.29. Exercise 2.1.28 calls for proving that integer powers of complex numbers obey the same rules as do the more familiar powers of real numbers, to wit, $z^m z^n = z^{m+n}$ and $(z^m)^n = z^{mn}$.


If $z = a + bi$ is any complex number, where $a$ and $b$ are real, we define $\bar{z}$, the conjugate of $z$, to be $a - bi$. Thus, $\overline{-2 + 3i} = -2 - 3i$ and $\overline{3 - 4i} = 3 + 4i$.




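Theorem 2.4 If $z = a + bi$ is any complex number,
(a) $z$ and $\bar{z}$ are symmetrical with respect to the x-axis;
(b) $z\bar{z} = |z|^2$; $\arg(z\bar{z}) = \arg(z) + \arg(\bar{z})$;
(c) $\overline{z + w} = \bar{z} + \bar{w}$; $\overline{zw} = \bar{z}\,\bar{w}$; $\overline{z^n} = \bar{z}^{\,n}$;
(d) $\bar{z} = z$ if and only if $z$ is real.

Proof. See Exercise 2.1.27. ∎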


Exercises 2.1 


Find the argument and modulus of each of the complex numbers in Exercises 2.1.1 to 2.1.4.

1. $2 + 3i$
2. $3 - 2i$
3. $-3 - 4i$
4. $-1 + 7i$

Express the complex quantities in Exercises 2.1.5 to 2.1.21 in the form $a + bi$, where $a$ and $b$ are real numbers.

5. $(2 + 3i) + (5 - i)$
6. $(17 - 3i) + (2 + 3i)$
7. $(2 + 3i) - (5 - i)$
8. $(17 - 3i) - (2 + 3i)$
9. $(2 + 3i)(5 - i)$
10. $(17 - 3i)(2 + 3i)$
11. $(2 + 3i)/(5 - i)$
12. $(17 - 3i)/(2 + 3i)$
13. $(\sqrt{3} + 5i)/(2 - \sqrt{3}\,i)$
14. $(a + bi)/(a - bi) - (a - bi)/(a + bi)$
15. $(2 - i)^2/(1 + i)$
16. $(1 + i)^4$
17. [illegible in the source]
18. [illegible in the source]
19. $i^{4321}$
20. $((1 - i)/(1 + i))^{[\ldots]}$ (exponent illegible in the source)
21. $i^{[\ldots]}$ ($n$ is an integer; exponent illegible in the source)

Solve the equations in Exercises 2.1.22 to 2.1.25 for $z$ and $w$:

22. $(1 + 2i)z + 5 = 0$
23. $(1 + i)z + 5i = -2$
24. $iz - w = 1 + i$ and $(1 + i)z + iw = 1$
25. $(1 - i)z + iw = i$ and $2z - (2 + i)w = 1$

26. Prove that if $z$ and $w$ are any two complex numbers, then $|z + w| \le |z| + |w|$.


27. Prove the following:
    (a) Theorem 2.4(a)
    (b) Theorem 2.4(b)
    (c) Theorem 2.4(c)
    (d) Theorem 2.4(d)
    (e) If $a$, $b$, $c$, and $d$ are any real numbers, prove that $z$ is a root of the equation $ax^3 + bx^2 + cx + d = 0$ if and only if $\bar{z}$ is also a root of the same equation.

28. Prove that if $z$ is any complex number and $m$ and $n$ are any integers, then $z^m z^n = z^{m+n}$ and $(z^m)^n = z^{mn}$.

29. Prove that Corollary 2.3 holds for every integer $k$.

30. Prove that the three distinct complex numbers $z_1$, $z_2$, and $z_3$ are collinear if and only if there exists a real number $\lambda$ such that $z_3 = (1 - \lambda)z_1 + \lambda z_2$. (Hint: examine the expression $(z_3 - z_1)/(z_2 - z_1)$.)

31. Prove that if $z$ and $w$ are two complex numbers, then the distance between them equals $|z - w|$.

32. Prove that the midpoint of the line segment joining the complex numbers $z$ and $w$ is $(z + w)/2$.

33. Prove that the four complex numbers $z_1$, $z_2$, $z_3$, and $z_4$ lie on either a common straight line or a common circle if and only if the number

$$\frac{(z_1 - z_3)/(z_1 - z_4)}{(z_2 - z_3)/(z_2 - z_4)}$$

is real.

34. Let $z_1$, $z_2$, $z_3$, and $z_4$ be four complex numbers such that $|z_1| = |z_2| = |z_3| = |z_4| = 1$. Prove that $z_1$, $z_2$, $z_3$, and $z_4$ form a rectangle if and only if $z_1 + z_2 + z_3 + z_4 = 0$.

35. Prove that the center of gravity of the triangle whose vertices are the complex numbers $z_1$, $z_2$, and $z_3$ is $(z_1 + z_2 + z_3)/3$. (Hint: recall that the center of gravity of a triangle coincides with the intersection of its three medians.)

36. Prove that if $|\zeta| = 1$ and $\zeta \ne 1$, then there is a real number $b$ such that $(1 + \zeta)/(1 - \zeta) = bi$.

37. Prove that if $z = a + bi$, where $a$ and $b$ are real, then $(|a| + |b|)/\sqrt{2} \le |z| \le |a| + |b|$.




2.2 Complex Roots 


In the previous section the four arithmetical operations were extended to complex numbers. Next we examine the process of finding roots of complex numbers. What, for example, is $\sqrt{i}$? Before addressing this question, it behooves us to recall that even $\sqrt{1}$ involves some ambiguities. Sometimes it is 1 and sometimes it is $-1$ or both $\pm 1$. We therefore define $\sqrt[n]{z}$, for any complex number $z$ and for any positive integer $n$, to be the set of all the complex numbers $w$ such that $w^n = z$.

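Returning to $\sqrt{i}$, let $j$ be any complex number such that $j^2 = i$. Then $2\arg(j) = \arg(i)$ and $|j|^2 = |i| = 1$. Consequently, using $\arg(i) = \pi/2$, we get $\arg(j) = \arg(i)/2 = \pi/4$, and since $|j|$ is, by definition, positive, $|j| = 1$. Thus

$$j = \cos\frac{\pi}{4} + i\sin\frac{\pi}{4} = \frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2}\,i.$$

Another square root of $i$ is of course

$$-j = -\frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{2}\,i.$$

An alternate method for arriving at $-j$ is to recall that $\arg(i)$ could also have been taken as $5\pi/2$, in which case we obtain the square root

$$\cos\frac{5\pi}{4} + i\sin\frac{5\pi}{4} = -\frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{2}\,i = -j.$$

It can be easily verified by direct calculations that

$$\left(\frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2}\,i\right)^2 = i = \left(-\frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{2}\,i\right)^2.$$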


This procedure yields three different values for $\sqrt[3]{1}$. For, taking as the argument of 1 the successive values of $0, 2\pi, 4\pi, 6\pi, \ldots$, each of the elements of $\sqrt[3]{1}$ must have as its argument one of the values $0, 2\pi/3, 4\pi/3, 2\pi, \ldots$. The modulus of 1 being 1, it follows that the modulus of $\sqrt[3]{1}$ must be the real cube root of 1, which is also 1. Hence we get as our cube roots of 1 the following numbers:

$$1(\cos 0 + i\sin 0) = 1\cdot 1 + i\cdot 0 = 1,$$
$$1\left(\cos\frac{2\pi}{3} + i\sin\frac{2\pi}{3}\right) = -\frac{1}{2} + \frac{\sqrt{3}}{2}\,i,$$
$$1\left(\cos\frac{4\pi}{3} + i\sin\frac{4\pi}{3}\right) = -\frac{1}{2} - \frac{\sqrt{3}}{2}\,i,$$
$$1(\cos 2\pi + i\sin 2\pi) = 1, \ \ldots$$

Figure 2.4 Complex roots of unity


It is clear that this list of cube roots of 1 will cycle through the same three values. The second root, the one with argument $2\pi/3$, is denoted by $\omega$. By Corollary 2.3, the third root, having double the argument of $\omega$ and the same modulus of 1, equals $\omega^2$. Note that the Cartesian representations of these three complex cube roots of 1 form an equilateral triangle in the Cartesian plane (Figure 2.4).

The same procedure yields all the $n$-th roots of unity (roots of 1) for each positive integer $n$.


Theorem 2.5 Let $n$ be a positive integer, and let $\zeta = \cos(2\pi/n) + i\sin(2\pi/n)$. Then
$$\sqrt[n]{1} = \{1, \zeta, \zeta^2, \zeta^3, \ldots, \zeta^{n-1}\}.$$


Proof. By Corollary 2.3, $\zeta^n = \cos 2\pi + i\sin 2\pi = 1$, and so $\zeta$ is indeed one of the elements of $\sqrt[n]{1}$. Moreover, for any integer $k$,

$$(\zeta^k)^n = (\zeta^n)^k = 1^k = 1,$$

and hence each $\zeta^k$ is indeed an $n$-th root of unity. Since $\arg(\zeta^k) = 2\pi k/n$, it follows that the numbers $1, \zeta, \zeta^2, \ldots, \zeta^{n-1}$ all have distinct arguments and so they are all distinct complex numbers. The remainder of the proof, that there are no other roots of unity, is relegated to Exercise 2.2.29. We note that this also follows from the fact (see Proposition 6.8) that a polynomial equation of degree $n$ has at most $n$ roots. ∎
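Theorem 2.5 lends itself to a quick numerical illustration. The following sketch assumes Python's standard `cmath` module and uses the illustrative helper name `roots_of_unity`; it lists the powers $1, \zeta, \ldots, \zeta^{n-1}$ in floating point, so each check holds only up to rounding error.

```python
import cmath

def roots_of_unity(n):
    """Return [1, zeta, zeta**2, ..., zeta**(n-1)] for zeta = cos(2*pi/n) + i*sin(2*pi/n)."""
    zeta = cmath.exp(2j * cmath.pi / n)   # cos(2*pi/n) + i*sin(2*pi/n)
    return [zeta ** k for k in range(n)]

roots = roots_of_unity(6)
for r in roots:
    # every listed number should satisfy r**6 = 1 up to rounding
    print(round(r.real, 4), round(r.imag, 4), abs(r ** 6 - 1) < 1e-9)
```

For $n = 6$ the second entry printed is approximately $0.5 + 0.866i$, the first 6-th root of unity.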


The $n$-th root of unity

$$\zeta = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}$$

will be referred to as the first $n$-th root of unity. It is clear that $|\zeta| = 1$, and hence, by Theorem 2.2, $|\zeta^k| = 1$ for every integer $k$. Since the angle subtended by the consecutive roots $\zeta^{k+1}$ and $\zeta^k$ at the origin equals

$$\arg\left(\frac{\zeta^{k+1}}{\zeta^k}\right) = \arg(\zeta) = \frac{2\pi}{n}$$

and is independent of $k$, it follows that the elements of $\sqrt[n]{1}$ form a regular $n$-gon that is centered at the origin. This fact is crucial for the next section, and so we state it as a proposition.


Proposition 2.6 For any fixed integer $n \ge 3$, the Cartesian representations of the elements of $\sqrt[n]{1}$ form the vertices of a regular $n$-gon.
For example, since $2\pi/4 = \pi/2$ and

$$\cos\frac{\pi}{2} + i\sin\frac{\pi}{2} = 0 + i = i,$$

it follows that

$$\sqrt[4]{1} = \{1, i, i^2, i^3\} = \{1, i, -1, -i\},$$

the elements of which form the vertices of a square (Figure 2.4).


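Similarly, since $2\pi/6 = \pi/3$ and

$$\cos\frac{\pi}{3} + i\sin\frac{\pi}{3} = \frac{1}{2} + \frac{\sqrt{3}}{2}\,i = -\omega^2,$$

it follows that

$$\sqrt[6]{1} = \{1, -\omega^2, (-\omega^2)^2, (-\omega^2)^3, (-\omega^2)^4, (-\omega^2)^5\} = \{1, -\omega^2, \omega, -1, \omega^2, -\omega\},$$

the elements of which form the vertices of the regular hexagon of Figure 2.4.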
The following proposition is both a natural extension and a corollary of Theorem 2.5. 
Since our main interest lies in the roots of unity, its proof is omitted and relegated to 


Exercise 2.2.18. The subsequent example clarifies the purport of the proposition. 


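Proposition 2.7 Let $n$ be a positive integer and $z$ any nonzero complex number with argument $\theta$. If

$$\zeta = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n},$$

then

$$\sqrt[n]{z} = \left\{ \left|\sqrt[n]{z}\right|\left(\cos\frac{\theta}{n} + i\sin\frac{\theta}{n}\right)\zeta^k \;\middle|\; k = 0, 1, 2, \ldots, n-1 \right\}$$

where $\left|\sqrt[n]{z}\right|$ denotes the common modulus of all the elements of $\sqrt[n]{z}$.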


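Since $-1 + i$ has modulus $\sqrt{2}$ and argument $3\pi/4$, and since

$$\cos\frac{\pi}{4} + i\sin\frac{\pi}{4} = \frac{1}{\sqrt{2}}(1 + i),$$

it follows that

$$\sqrt[3]{-1 + i} = \left\{ \sqrt[6]{2}\,\frac{1 + i}{\sqrt{2}}\,\omega^k \;\middle|\; k = 0, 1, 2 \right\} = \left\{ \frac{1 + i}{\sqrt[3]{2}},\ \frac{1 + i}{\sqrt[3]{2}}\,\omega,\ \frac{1 + i}{\sqrt[3]{2}}\,\omega^2 \right\}.$$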

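It should be noted here that calculators are very handy in computing complex roots too. Thus, to compute $\sqrt[4]{2 + 3i}$ we note that

$$\left|\sqrt[4]{2 + 3i}\right| = \sqrt[4]{\sqrt{2^2 + 3^2}} = \sqrt[8]{13} \approx 1.378$$

and

$$\arg\left(\sqrt[4]{2 + 3i}\right) = \frac{1}{4}\arg(2 + 3i) = \frac{1}{4}\arctan\left(\frac{3}{2}\right) \approx 14.077^\circ.$$

Hence, if we set

$$w = \cos\left(\frac{1}{4}\arctan\frac{3}{2}\right) + i\sin\left(\frac{1}{4}\arctan\frac{3}{2}\right) \approx .970 + .243i,$$

then

$$\begin{aligned}
\sqrt[4]{2 + 3i} &\approx 1.378\,\{w, wi, -w, -wi\} \\
&= 1.378\,\{.970 + .243i,\ -.243 + .970i,\ -.970 - .243i,\ .243 - .970i\} \\
&\approx \{1.336 + .335i,\ -.335 + 1.336i,\ -1.336 - .335i,\ .335 - 1.336i\}.
\end{aligned}$$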
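The calculator computation above follows the recipe of Proposition 2.7 and can be sketched in a few lines. This is an illustration under stated assumptions: it uses Python's standard `cmath` and `math` modules, the helper name `nth_roots` is hypothetical, and the results are floating-point approximations.

```python
import cmath
import math

def nth_roots(z, n):
    """Approximate all n-th roots of a nonzero complex number z (Proposition 2.7)."""
    r, theta = cmath.polar(z)                      # modulus and one argument of z
    root_modulus = r ** (1.0 / n)                  # common modulus of the roots
    first = root_modulus * cmath.exp(1j * theta / n)
    zeta = cmath.exp(2j * math.pi / n)             # first n-th root of unity
    return [first * zeta ** k for k in range(n)]

roots = nth_roots(2 + 3j, 4)
for w in roots:
    print(round(w.real, 3), round(w.imag, 3))
```

Any argument of $z$ may be used for `theta`; a different choice only permutes the list of roots.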


The solution of the quadratic equation detailed in Chapter 1 works for complex coefficients as well. Accordingly, the roots of the equation $iz^2 + 2z - 2i = 0$ are

$$\frac{-2 \pm \sqrt{2^2 - 4\cdot i\cdot(-2i)}}{2i} = \frac{-2 \pm \sqrt{-4}}{2i} = \frac{-2 \pm 2i}{2i} = \pm 1 + i.$$
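As a sanity check on the example above, the quadratic formula can be evaluated with complex arithmetic. The sketch below assumes Python's standard `cmath` module; the function name `quadratic_roots` is illustrative. `cmath.sqrt` returns one of the two square roots of the discriminant, and the $\pm$ sign supplies the other.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return both roots of a*z**2 + b*z + c = 0 for complex a != 0."""
    d = cmath.sqrt(b * b - 4 * a * c)    # one square root of the discriminant
    return ((-b + d) / (2 * a), (-b - d) / (2 * a))

r1, r2 = quadratic_roots(1j, 2, -2j)
for r in (r1, r2):
    residual = 1j * r * r + 2 * r - 2j   # should vanish up to rounding
    print(r, abs(residual) < 1e-12)
```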


We conclude this section with a curious fact that will shortly prove unexpectedly useful. 


Proposition 2.8 For any fixed integer $n > 1$, the sum of the elements of $\sqrt[n]{1}$ is 0.

Proof. Let $\zeta$ be the first $n$-th root of unity. By Theorem 2.5, the elements of $\sqrt[n]{1}$ can be listed as $1, \zeta, \zeta^2, \ldots, \zeta^{n-1}$. The formula for geometric progressions now yields

$$1 + \zeta + \zeta^2 + \cdots + \zeta^{n-1} = \frac{1 - \zeta^n}{1 - \zeta} = \frac{1 - 1}{1 - \zeta} = 0. \qquad ∎$$
Exercises 2.2 


Express each of the elements of the sets in Exercises 2.2.1 to 2.2.12 in the form $a + bi$, where $a$ and $b$ are real numbers.
nm vt 7 Viti 
2 VI 8. vi 
3. V-l 9 
4 Vi 10. /100—37i 
5. 3-4i ur. Ve*7—14 2ci 
6. V¥8—30i 12. 4/4ed —2(c? —d?)i 


13. Resolve the following paradox:

$$1 = \sqrt{1} = \sqrt{(-1)(-1)} = \sqrt{-1}\,\sqrt{-1} = i\cdot i = i^2 = -1.$$




Find the complex solutions of the equations in Exercises 2.2.14 to 2.2.17. 


14. z°-6z+9+42i=0 16. z°—(1+i)z+5i=0 
15. (1—2i)z?+2z2+1=0 17. 27 +3(1+i)z—(2—3i)=0 


18. Prove Proposition 2.7. 


Prove the identities in Exercises 2.2.19 to 2.2.22. 

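19. $(1 + \omega^2)^{16} = \omega$

20. $(3 + 5\omega + 3\omega^2)^9 = 512$

21. $a^3 + b^3 = (a + b)(a + b\omega)(a + b\omega^2)$

22. $(a + b + c)(a + b\omega + c\omega^2)(a + b\omega^2 + c\omega) = a^3 + b^3 + c^3 - 3abc$

If $\zeta$ is an $n$-th root of unity, simplify $\zeta^k$ for the values of $n$ and $k$ specified in Exercises 2.2.23 to 2.2.26.

23. $n = 10$, $k = 135$
24. $n = 135$, $k = 999$
25. $n = 999$, $k = 12{,}345$
26. $n = 12{,}345$, $k = 106$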


27. Prove that $z \in \sqrt[n]{1}$ if and only if $\bar{z} \in \sqrt[n]{1}$.

28. Prove that for every positive integer $n > 1$,
$$\sum_{k=1}^{n} \cos\frac{2\pi k}{n} = 0 = \sum_{k=1}^{n} \sin\frac{2\pi k}{n}.$$
k=l ‘i k=l e 


29. Complete the proof of Theorem 2.5 by showing that the given list contains all the 


n-th roots of 1. 


30. Prove that three complex numbers $A$, $B$, and $C$ form a counterclockwise equilateral triangle if and only if $A + B\omega + C\omega^2 = 0$. What equation characterizes clockwise equilateral triangles?

31. Suppose ABC is any triangle in the plane. Let A’, B’, and C’ be three points 
such that A’BC, AB’C, and ABC’ are all clockwise (or all counterclockwise) 
equilateral triangles. Prove that the centers of triangles A’BC, AB’C, and ABC’ 


also form an equilateral triangle. 




2.3 Solvability by Radicals I 


We now have sufficient tools at our disposal to formalize the notion of an algebraic solution of an equation

$$a_0x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n = 0$$

where $a_0, a_1, \ldots, a_n$ are any complex numbers. A solution of this equation is of course another complex number $r$ such that $a_0r^n + a_1r^{n-1} + \cdots + a_{n-1}r + a_n = 0$. The value of the solution $r$ clearly depends on the coefficients $a_0, a_1, \ldots, a_n$, and the solution is said to be algebraic if this dependence involves only radicals and the four arithmetic operations. More precisely, let $\mathbb{Z}$ denote the set of integers and let $V$ be any set of complex numbers. The complex number $z$ is said to have an algebraic expression in $V$ if there exists a sequence of complex numbers $z_1, z_2, \ldots, z_n = z$ such that for each $i = 1, 2, \ldots, n$, either $z_i \in \sqrt[m]{z_{i-1}}$ for some positive integer $m > 1$ or else the number $z_i$ is obtained by adding, subtracting, multiplying, or dividing some elements of $\mathbb{Z} \cup V \cup \{z_1, z_2, \ldots, z_{i-1}\}$. Thus, each of the solutions of the quadratic equation $ax^2 + bx + c = 0$ (where $a \ne 0$), namely

$$z = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a},$$

has an algebraic expression in $\{a, b, c\}$ because it is possible to choose

$$z_1 = b^2,\quad z_2 = ac,\quad z_3 = 4z_2,\quad z_4 = z_1 - z_3,$$
$$z_5 \in \sqrt{z_4},\quad z_6 = -b + z_5,\quad z_7 = z_6/a,\quad z = z_8 = z_7/2.$$
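The sequence $z_1, \ldots, z_8$ above is effectively a straight-line program: each step applies one arithmetic operation or one root extraction to earlier entries. A minimal sketch, assuming Python's standard `cmath` module and the illustrative function name `quadratic_chain`, evaluates it step by step. Here `cmath.sqrt` picks one element of the set $\sqrt{z_4}$; the other element yields the second root.

```python
import cmath

def quadratic_chain(a, b, c):
    """Evaluate the sequence z1, ..., z8 for a*x**2 + b*x + c = 0 (a != 0)."""
    z1 = b * b
    z2 = a * c
    z3 = 4 * z2
    z4 = z1 - z3             # the discriminant b^2 - 4ac
    z5 = cmath.sqrt(z4)      # one element of the set sqrt(z4)
    z6 = -b + z5
    z7 = z6 / a
    z8 = z7 / 2              # a root of the quadratic
    return [z1, z2, z3, z4, z5, z6, z7, z8]

chain = quadratic_chain(1, -3, 2)
print(chain[-1])             # one root of x^2 - 3x + 2
```

Only one radical appears, and it is a square root, which is why the expression is of degree 2 in the sense defined in this section.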


If the complex number $z$ has an algebraic expression in $V$ whose only radicals are square roots, then we say that $z$ has a degree 2 algebraic expression in $V$. Thus, the above solutions of the quadratic equation clearly have degree 2 algebraic expressions in $\{a, b, c\}$. If no radicals appear in an algebraic expression of $z$ in $V$, then $z$ is said to have a rational expression in $V$. For example, if $c \ne -1$, then $(2 - ab)/(1 + c)$ has a rational expression in $\{a, b, c\}$ with $n = 4$ where $z_1 = ab$, $z_2 = 2 - z_1$, $z_3 = 1 + c$, and $z = z_4 = z_2/z_3$. If $z$ has an algebraic expression in $V$, and $V$ happens to be empty, we shall say that $z$ has an algebraic expression in the integers. Finally, we say that an equation

$$a_0x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_n = 0$$

is solvable by radicals or algebraically resolvable if each of its roots has an algebraic expression in the coefficients $\{a_0, a_1, a_2, \ldots, a_n\}$. Thus, the quadratic equation $ax^2 + bx + c = 0$ is solvable by radicals, as is easily verified by examining the quadratic formula above. The
equation 


$$x^4 + ax^3 + bx^2 + ax + 1 = 0 \qquad (2.9)$$

is also solvable by radicals. To see this we observe that 0 is not a root of this equation so that it can be divided by $x^2$ and its terms can be regrouped as

$$x^2 + \frac{1}{x^2} + a\left(x + \frac{1}{x}\right) + b = 0$$

or

$$\left(x + \frac{1}{x}\right)^2 + a\left(x + \frac{1}{x}\right) + b - 2 = 0.$$

Setting $u = x + 1/x$ we note that $u^2 + au + b - 2 = 0$ and $x^2 - ux + 1 = 0$. Thus, $x$ has an algebraic expression in $\{u\}$, and $u$, in turn, has an algebraic expression in $\{a, b\}$. This, of course, means that every solution of Equation 2.9 has an algebraic expression in its coefficients.
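The two-stage reduction of Equation 2.9 translates directly into two applications of the quadratic formula. The sketch below is an illustration only: it assumes Python's standard `cmath` module, and the function name `solve_palindromic_quartic` is hypothetical. It first solves $u^2 + au + b - 2 = 0$ and then, for each $u$, solves $x^2 - ux + 1 = 0$.

```python
import cmath

def solve_palindromic_quartic(a, b):
    """Approximate the four roots of x^4 + a*x^3 + b*x^2 + a*x + 1 = 0."""
    roots = []
    disc_u = cmath.sqrt(a * a - 4 * (b - 2))          # discriminant of u^2 + a*u + (b - 2)
    for u in ((-a + disc_u) / 2, (-a - disc_u) / 2):
        disc_x = cmath.sqrt(u * u - 4)                # discriminant of x^2 - u*x + 1
        roots.append((u + disc_x) / 2)
        roots.append((u - disc_x) / 2)
    return roots

# Example: x^4 + 3x^3 - x^2 + 3x + 1 = 0
for x in solve_palindromic_quartic(3, -1):
    residual = x**4 + 3 * x**3 - x**2 + 3 * x + 1
    print(x, abs(residual) < 1e-9)
```

Only square roots are used, so the roots have degree 2 algebraic expressions in $\{a, b\}$.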
The following observations are easily proved by induction on the minimum number of equations needed to express a number $x$ as an algebraic expression of a set $V$:
• If $x$ has a rational expression in $V$ and each element $v$ of $V$ has a rational expression in the set $W$, then $x$ has a rational expression in the elements of $W$.
• If $x$ has an algebraic expression in $V$ and each element $v$ of $V$ has an algebraic expression in the set $W$, then $x$ has an algebraic expression in the elements of $W$.
• If $x$ has an algebraic expression in $V$ and each element $v$ of $V$ has a rational expression in the set $W$, then $x$ has an algebraic expression in the elements of $W$.
• If $x$ has a rational expression in $V$ and each element $v$ of $V$ has an algebraic expression in the set $W$, then $x$ has an algebraic expression in the elements of $W$.
Exercises 2.3 


Decide whether the expressions in Exercises 2.3.1 to 2.3.11 are rational or (degree 2) 
algebraic expressions in { x, y, z }. 
tr. (x+y—1)/(x—z+2) 3. (xt y—1)/(x-—z+ V2) 


2, (x a/y =A) (e242) 4- (x +)!" 




1910" 9. w+16x +123 (where w* = 1) 
x* 

10, VytZ 
10% 


i/V¥ «+ 2y— V32 Il. sinx 


12. Explain why the solutions of the simultaneous equations

$$ax + by + c = 0, \qquad dx + ey + f = 0$$

have rational expressions in $\{a, b, c, d, e, f\}$ whenever $ae \ne bd$.

13. Explain why the solutions of the simultaneous equations

$$ax + by + c = 0, \qquad dx + ey + f = 0$$

need not have rational expressions in $\{a, b, c, d, e, f\}$ when $ae = bd$.

14. Explain why the solutions of the simultaneous equations

$$x^2 + y^2 + x + y + a = 0,$$
$$x^2 + y^2 + 2x + 3y + b = 0$$

have degree 2 algebraic expressions in $\{a, b\}$.

15. Prove that the equation $x^4 + 3x^3 - x^2 + 3x + 1 = 0$ is algebraically resolvable.

16. Prove that the equation $x^6 - 2x^5 - 4x^3 - 2x + 1 = 0$ is algebraically resolvable.

17. Explain why the solutions of the equation $x^4 + ax^2 + 1 = 0$ have degree 2 algebraic expressions in $\{a\}$.

18. Explain why the solutions of the simultaneous equations $x^8 + y^8 = a$ and $x^8y^8 = b$ have degree 2 algebraic expressions in $\{a, b\}$.

19. Explain why the solutions of the simultaneous equations $x^4 + y^4 = a$ and $x^4y^4 = b$ have degree 2 algebraic expressions in $\{a, b\}$.

20. Explain why the solutions of the simultaneous equations $x^3 + y^3 = a$ and $x^3y^3 = b$ have algebraic expressions in $\{a, b\}$.

21. Show that the solution of the equation $x^6 + x^5 + x^4 + x^3 + x^2 + x + 1 = 0$ can be reduced to the solution of some quadratic and cubic equations.




22. Prove that the roots of the equation $(A + C - B)x^2 + 2Cx + (B + C - A) = 0$ have rational expressions in $\{A, B, C\}$.

23. Prove that the roots of the equation $ABC^2x^2 + (3A^2 + B^2)Cx - (6A^2 + AB - 2B^2) = 0$ have rational expressions in $\{A, B, C\}$.


2.4 Ruler-and-Compass Constructibility of Regular Polygons 


The ancient Greek mathematicians, who invented what we have come to call Euclidean 
geometry and the notion of a rigorous proof, bequeathed their successors a host of 
unsolved mathematical problems. Best-known amongst these are the questions of whether 
it is possible to trisect an angle, double a cube, or square a circle by means of a compass and 
an unmarked ruler alone. Here we treat a lesser-known, but equally natural, construction 
problem, namely, what regular polygons are constructible by ruler and compass alone? 
The other three problems are discussed informally at the end of the section. 

The ruler-and-compass constructions of the equilateral triangle, the square, and the 
regular hexagon are standard fare in the high school curriculum. That the regular pen- 
tagon is also so constructible is true, but not so widely known. This is proved below in 
Proposition 2.14. A regular octagon is easily constructed by inscribing a square in a circle 
and then drawing the two diameters that are perpendicular to the sides of the square 
(Figure 2.5). In general, it is clear that given any regular $n$-gon it is possible to derive from it a regular $2n$-gon by drawing radii perpendicular to its sides. Hence the regular $n$-gon is constructible for $n = 2^{m+2}$, $3\cdot 2^m$, and $5\cdot 2^m$ for $m = 0, 1, 2, \ldots$. If a regular pentagon and an equilateral triangle are inscribed in a circle so that they share a vertex, as in Figure 2.6, then arc AB is $2/5 - 1/3 = 1/15$ of the total circumference of the circle. It follows that the regular 15-sided polygon is also constructible by ruler and compass.

This information is summarized as the following proposition.


Proposition 2.10 The regular $n$-sided polygon is constructible by ruler and compass for $n = 3, 4, 5, 6, 8, 10, 12, 15, 16$.

In 1796, Gauss proved that the regular 17-sided polygon is also constructible by ruler 
and compass alone, and we shall discuss in detail a proof that appears in his Disquisitiones 
Arithmeticae of 1801. Before that, however, it is necessary to make some general remarks 
about ruler-and-compass constructibility. We shall work in the Cartesian plane, so it will 
be assumed that a certain line segment has been designated as the unit segment. For the 
sake of brevity it will henceforth be said that a configuration is constructible when it is 


constructible by ruler and compass alone. A real number $\alpha$ is said to be constructible if it is possible to construct a line segment whose length is $|\alpha|$ times the length of the unit segment.

Figure 2.5 A square and a regular octagon

Figure 2.6 An equilateral triangle and a regular pentagon


Proposition 2.11 The point (x, y) is constructible if and only if its coordinates x and 


y are constructible real numbers. 


Proof. The nature of Cartesian coordinates is such that it is possible to pass from a point to its coordinates and vice versa by means of straight lines that are perpendicular to the axes. Since such perpendiculars are well known to be constructible, we are done. ∎


It is obvious that the number 1 is constructible and it follows from elementary Euclidean geometry that if $\alpha$ and $\beta$ are constructible real numbers, so are $\alpha \pm \beta$. In particular, every integer is constructible. The next lemma will provide us with a host of real constructible numbers.

Lemma 2.12 If $\alpha$ and $\beta$ are nonzero constructible real numbers, then so are their product $\alpha\beta$, their quotient $\beta/\alpha$, and the square root $\sqrt{|\alpha|}$.


Figure 2.7 The multiplication of constructible numbers

Proof. It is clear that we may restrict attention to positive $\alpha$ and $\beta$. On the positive x- and y-axes (Figure 2.7) let B, C, and D be points such that the lengths of the segments OB, OC, and OD are 1, $\alpha$, and $\beta$, respectively. Using a standard ruler-and-compass construction, draw through D a straight line that is parallel to BC and intersects OY at E. Since $\triangle OBC$ and $\triangle ODE$ are similar, it follows that $OD/OB = OE/OC$, or $\beta/1 = OE/\alpha$, and so the constructible line segment OE has length $\alpha\beta$.

In view of the above argument it suffices to show that $1/\alpha$ is constructible. On the positive x- and y-axes (Figure 2.8) let B, D, and E be points such that the lengths of the segments OB, OD, and OE are 1, $\alpha$, and 1, respectively. Join the points D and E and draw a line through B that is parallel to DE. If this line intersects OY at the point C, then, because of the similarity of $\triangle OBC$ and $\triangle ODE$, it follows that $OD/OB = OE/OC$, or $\alpha/1 = 1/OC$, and so the constructible line segment OC has length $1/\alpha$.

Let A, D, and B be three collinear points such that the line segments AD and DB have lengths $\alpha$ and 1 respectively (Figure 2.9). Let C be the intersection of the line perpendicular to AB at the point D with the semicircle that has AB as its diameter (all these are well known to be constructible). The triangles ACD and CBD are right triangles each of which also shares an acute angle with the right triangle $\triangle ABC$. Thus, $\triangle ACD$ and $\triangle CBD$ are both similar to $\triangle ABC$ and hence they are also similar to each other. Consequently, $AD/CD = CD/BD$, or $\alpha/CD = CD/1$, and hence the constructible segment CD has length $\sqrt{\alpha}$. ∎
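The square-root construction in the last paragraph of the proof can be checked with coordinates. In the sketch below, which assumes Python's standard `math` module, the points are placed at $A = (0, 0)$, $D = (\alpha, 0)$, and $B = (\alpha + 1, 0)$; this placement and the function name `altitude_on_semicircle` are choices made here for illustration. The function returns the height of the point C on the semicircle above D.

```python
import math

def altitude_on_semicircle(alpha):
    """Length CD, where C lies above D = (alpha, 0) on the semicircle with diameter AB."""
    radius = (alpha + 1) / 2          # AB has length alpha + 1
    center_x = radius                 # center of the semicircle on the x-axis
    return math.sqrt(radius ** 2 - (alpha - center_x) ** 2)

for alpha in (2.0, 9.0, 0.25):
    cd = altitude_on_semicircle(alpha)
    # CD should equal sqrt(alpha), and AD/CD should equal CD/BD
    print(alpha, cd, math.isclose(cd, math.sqrt(alpha)))
```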


It follows from Lemma 2.12 that every rational number is constructible, as is every real number that has a degree 2 algebraic expression in the integers. A complex number is said to be constructible if its Cartesian representation is a constructible point. The above geometric proposition results in an algebraic description of some constructible complex numbers.

Figure 2.8 The reciprocal of a constructible number

Figure 2.9 The square root of a constructible number


Corollary 2.13 If the complex number $z = x + iy$ has a degree 2 algebraic expression in the integers, then it is constructible.

Proof. Because of the recursive nature of the definition of algebraic expressions it suffices to show that if $z = x + iy$ and $w = u + iv$ are constructible complex numbers, then so are $z + w$, $z - w$, $zw$, $z/w$, and $\sqrt{z}$. However, if $z$ and $w$ are constructible complex numbers, then, by Proposition 2.11 and Lemma 2.12, so are

$$z \pm w = (x \pm u) + i(y \pm v),$$
$$zw = (xu - yv) + i(xv + yu),$$
$$\frac{1}{z} = \frac{x}{x^2 + y^2} - i\,\frac{y}{x^2 + y^2},$$

and $z/w = z(1/w)$.

To argue the constructibility of $\sqrt{z}$ we note that it consists of the intersections of the circle centered at the origin and of radius $\sqrt{|z|}$ with the straight line that bisects the argument of $z$. Since $\sqrt{|z|}$ is constructible by Lemma 2.12, and angle bisection is a well-known ruler-and-compass construction, we are done. ∎




The converse of this corollary also holds, and its proof is relegated to Exercise 2.4.28. Our approach to proving the constructibility of the regular pentagon and the regular 17-sided polygon is based on the observation that the elements of $\sqrt[n]{1}$ form a regular $n$-gon centered at the origin of the Cartesian plane. By Corollary 2.13 it suffices to show that each of the vertices of these polygons, when regarded as a complex number, has a degree 2 algebraic expression in the integers.
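Proposition 2.14 The regular pentagon is constructible by ruler and compass alone.

Proof. Let $\varepsilon$ denote the first 5-th root of unity. Since the remaining roots are $\varepsilon^2$, $\varepsilon^3$, $\varepsilon^4$, and 1, it follows from Lemma 2.12 that it suffices to prove that $\varepsilon$ has a degree 2 algebraic expression in the integers. By Proposition 2.8, $\varepsilon$ satisfies the equation

$$\varepsilon + \varepsilon^2 + \varepsilon^3 + \varepsilon^4 = -1.$$

Set $A = \varepsilon + \varepsilon^4$ and $B = \varepsilon^2 + \varepsilon^3$, and note that

$$A + B = \varepsilon + \varepsilon^4 + \varepsilon^2 + \varepsilon^3 = -1$$

and

$$AB = (\varepsilon + \varepsilon^4)(\varepsilon^2 + \varepsilon^3) = \varepsilon^3 + \varepsilon^4 + \varepsilon^6 + \varepsilon^7 = \varepsilon^3 + \varepsilon^4 + \varepsilon + \varepsilon^2 = -1.$$

Hence, by Proposition 1.3, $A$ and $B$ are the solutions of the quadratic equation $x^2 + x - 1 = 0$, and so $A$ has a degree 2 algebraic expression in the integers. On the other hand,

$$A = \varepsilon + \varepsilon^4 = \varepsilon + \frac{1}{\varepsilon},$$

so that $\varepsilon^2 - A\varepsilon + 1 = 0$ and hence $\varepsilon$ is a solution of the quadratic equation $x^2 - Ax + 1 = 0$. It follows that $\varepsilon$ has a degree 2 algebraic expression in $A$, which in turn has a degree 2 algebraic expression in the integers. Hence, by Corollary 2.13, $\varepsilon$ is constructible. ∎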
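The two quadratic steps of this proof can be carried out numerically. The sketch below assumes Python's standard `cmath` and `math` modules; choosing the positive root of $x^2 + x - 1 = 0$ for `A` and the root of $x^2 - Ax + 1 = 0$ with positive imaginary part for `eps` are choices made here so that the result is the first 5-th root of unity.

```python
import cmath
import math

# Root of x^2 + x - 1 = 0 that equals eps + eps^4 = 2*cos(2*pi/5)
A = (-1 + math.sqrt(5)) / 2

# Root of x^2 - A*x + 1 = 0 with positive imaginary part
eps = (A + cmath.sqrt(A * A - 4)) / 2

print(eps)                               # approximately 0.309 + 0.951i
print(cmath.exp(2j * math.pi / 5))       # the first 5-th root of unity, for comparison
```

Both stages use only square roots, in line with the claim that $\varepsilon$ has a degree 2 algebraic expression in the integers.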


Exercise 2.4.1 calls for the explicit description of the coordinates of ¢ as degree 2 alge- 
braic expressions in the integers. Exercise 2.4.29 calls for a ruler-and-compass construction 


of the regular pentagon. We now turn to the regular 17-sided polygon (Figure 2.10). 
Theorem 2.15 The regular 17-sided polygon is ruler-and-compass constructible. 


Proof. Let $\zeta = \cos(2\pi/17) + i\sin(2\pi/17)$ be the first 17-th root of unity. Just as was the case for the regular pentagon, it suffices to show that $\zeta$ has a degree 2 algebraic expression in the integers. This will be accomplished by listing a sequence of elements, the last being $\zeta$, such that each member of the sequence has a degree 2 algebraic expression in the previous ones, and the first has a degree 2 algebraic expression in the integers.

Figure 2.10 The regular 17-gon

We begin with the not-at-all-natural observation that by Proposition 2.8,

$$\zeta + \zeta^3 + \zeta^9 + \zeta^{10} + \zeta^{13} + \zeta^5 + \zeta^{15} + \zeta^{11} + \zeta^{16} + \zeta^{14} + \zeta^8 + \zeta^7 + \zeta^4 + \zeta^{12} + \zeta^2 + \zeta^6 = -1.$$

Next, set

$$\begin{aligned}
A &= \zeta + \zeta^9 + \zeta^{13} + \zeta^{15} + \zeta^{16} + \zeta^8 + \zeta^4 + \zeta^2, \\
B &= \zeta^3 + \zeta^{10} + \zeta^5 + \zeta^{11} + \zeta^{14} + \zeta^7 + \zeta^{12} + \zeta^6, \\
C &= \zeta + \zeta^{13} + \zeta^{16} + \zeta^4, \\
D &= \zeta^9 + \zeta^{15} + \zeta^8 + \zeta^2, \\
E &= \zeta^3 + \zeta^5 + \zeta^{14} + \zeta^{12}, \\
F &= \zeta^{10} + \zeta^{11} + \zeta^7 + \zeta^6, \\
G &= \zeta + \zeta^{16}, \\
H &= \zeta^{13} + \zeta^4.
\end{aligned}$$

It is not at all clear at this point what the pattern is for writing out the imaginary 17-th roots of unity in the first of the above equations, but once that is taken for granted, the pattern for forming the subsequent sequences is quite obvious.




Since $G = \zeta + \zeta^{-1}$, it follows that $\zeta^2 - G\zeta + 1 = 0$, and hence $\zeta$ has a degree 2 algebraic expression in $G$. Next, $G + H = \zeta + \zeta^{16} + \zeta^{13} + \zeta^4 = C$ and

$$GH = (\zeta + \zeta^{16})(\zeta^{13} + \zeta^4) = \zeta^{14} + \zeta^5 + \zeta^{12} + \zeta^3 = E.$$

Hence, $G$ and $H$ are the roots of $x^2 - Cx + E = 0$.

Again, by making use of Proposition 2.8 it is easily verified (Exercise 2.4.3) that $C + D = A$, $CD = -1$, $E + F = B$, and $EF = -1$, from which it follows that $C$ and $D$ are the roots of $x^2 - Ax - 1 = 0$ and $E$ and $F$ are the roots of $x^2 - Bx - 1 = 0$. Finally, leaving some details to Exercise 2.4.4, we see that $A + B = -1$ and $AB = -4$ so that $A$ and $B$ are the roots of $x^2 + x - 4 = 0$. Thus,

• $A$ and $B$ have degree 2 algebraic expressions in the integers,
• $C$ and $E$ have degree 2 algebraic expressions in $A$ and $B$,
• $G$ has a degree 2 algebraic expression in $C$ and $E$, and
• $\zeta$ has a degree 2 algebraic expression in $G$.

Consequently, $\zeta$ has a degree 2 algebraic expression in the integers. Thus, the Cartesian representations of $\zeta$ and all of its powers are constructible. ∎
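The relations among the sums $A, B, \ldots, H$ used in this proof can be spot-checked in floating point. The sketch below assumes Python's standard `cmath` and `math` modules; the helper `period` and the list `exponents` (the powers of 3 reduced modulo 17, in the order used in the proof) are names introduced here for illustration. Each sum takes every second, fourth, or eighth term of the ordered list.

```python
import cmath
import math

zeta = cmath.exp(2j * math.pi / 17)
# Exponents 1, 3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6
exponents = [pow(3, j, 17) for j in range(16)]

def period(start, step):
    """Sum of zeta**e over every step-th exponent of the ordered list, beginning at index start."""
    return sum(zeta ** exponents[j] for j in range(start, 16, step))

A, B = period(0, 2), period(1, 2)
C, D, E, F = period(0, 4), period(2, 4), period(1, 4), period(3, 4)
G, H = period(0, 8), period(4, 8)

checks = {
    "A + B = -1": abs(A + B + 1),
    "AB = -4": abs(A * B + 4),
    "C + D = A": abs(C + D - A),
    "CD = -1": abs(C * D + 1),
    "E + F = B": abs(E + F - B),
    "EF = -1": abs(E * F + 1),
    "G + H = C": abs(G + H - C),
    "GH = E": abs(G * H - E),
}
for name, error in checks.items():
    print(name, error < 1e-9)
```

A numerical check of this kind is not a proof, but it guards against slips when working Exercises 2.4.3 and 2.4.4 by hand.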


While the above proof demonstrates the ruler-and-compass constructibility of the regular 17-sided polygon, it sidesteps the issue of actually constructing the vertices. The fact that the abscissa of the point $\zeta$ that is used in the proof of the above theorem is

$$-\frac{1}{16} + \frac{1}{16}\sqrt{17} + \frac{1}{16}\sqrt{34 - 2\sqrt{17}} + \frac{1}{8}\sqrt{17 + 3\sqrt{17} - \sqrt{34 - 2\sqrt{17}} - 2\sqrt{34 + 2\sqrt{17}}}$$

clarifies the reasons behind this avoidance.
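The nested-radical expression for the abscissa can be compared against $\cos(2\pi/17)$ numerically. The following sketch assumes Python's standard `math` module; the variable names are illustrative.

```python
import math

s17 = math.sqrt(17)
inner_minus = math.sqrt(34 - 2 * s17)
inner_plus = math.sqrt(34 + 2 * s17)

# -1/16 + sqrt(17)/16 + sqrt(34 - 2*sqrt(17))/16 + (1/8)*sqrt(...)
abscissa = (-1 + s17 + inner_minus) / 16 \
    + math.sqrt(17 + 3 * s17 - inner_minus - 2 * inner_plus) / 8

print(abscissa)
print(math.cos(2 * math.pi / 17))
```

The two printed values agree to within floating-point rounding, and only square roots of earlier quantities are involved.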


Gauss did not limit himself to the 17-gons. He went on to prove that for each prime number $p$ the equation $x^p - 1 = 0$ has a resolution of the type exhibited in Proposition 2.14 and Theorem 2.15. More specifically, Gauss demonstrated that the roots of $x^p - 1 = 0$ have algebraic expressions in the integers that call only for radicals of the type $\sqrt[q]{x}$ where $q$ is any prime factor of $p - 1$. In particular, if $p - 1$ is a power of 2, as is the case for $p = 5$ and $p = 17$, then these roots all have degree 2 algebraic expressions in the integers, as illustrated by Proposition 2.14 and Theorem 2.15.

This, of course, means that for such $p$ the regular $p$-sided polygon is constructible by ruler and compass. It so happens (Exercise 2.4.5) that if $p$ is a prime such that $p - 1$ is a power of 2, then $p = 2^{2^m} + 1$ for some integer $m$. Unfortunately, only five such values of $p$ are known, namely

$$3 = 2^{2^0} + 1,\quad 5 = 2^{2^1} + 1,\quad 17 = 2^{2^2} + 1,\quad 257 = 2^{2^3} + 1,\quad 65{,}537 = 2^{2^4} + 1.$$

The next number in this sequence,

$$2^{2^5} + 1 = 4{,}294{,}967{,}297,$$

is not a prime since it equals $641\cdot 6{,}700{,}417$, as discovered by Euler in the eighteenth century. In fact, all of the numbers

$$2^{2^m} + 1$$

for $m = 5, 6, \ldots, 16$ are now known to be composite.
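The arithmetic facts quoted for small $m$ are easy to confirm with exact integers. The sketch below uses plain trial division, which is adequate for $m \le 5$ but not for the larger cases mentioned above; the function names `fermat_number` and `smallest_factor` are illustrative.

```python
def fermat_number(m):
    """Return 2**(2**m) + 1."""
    return 2 ** (2 ** m) + 1

def smallest_factor(n):
    """Return the smallest factor of n greater than 1, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n                      # no proper factor found, so n is prime

for m in range(6):
    f = fermat_number(m)
    p = smallest_factor(f)
    print(m, f, "prime" if p == f else f"{p} * {f // p}")
```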

Gauss also claimed to have proved rigorously that when $p$ is a prime that is not of the form $2^{2^m} + 1$ (such as 7 or 11), the regular $p$-sided polygon is not constructible by ruler and compass alone. While he never wrote down his proof, this fact is now known to be true.

The strange ordering of the 17-th roots of unity that was used in the proof of Theorem 2.15 merits some discussion. The series

\[
\zeta + \zeta^{3} + \zeta^{9} + \zeta^{10} + \zeta^{13} + \zeta^{5} + \zeta^{15} + \zeta^{11} + \zeta^{16} + \zeta^{14} + \zeta^{8} + \zeta^{7} + \zeta^{4} + \zeta^{12} + \zeta^{2} + \zeta^{6}
\]

is such that each root is the cube of the previous one. Thus,

\[
(\zeta^{3})^3 = \zeta^{9}, \quad (\zeta^{9})^3 = \zeta^{27} = \zeta^{10}, \quad \ldots, \quad (\zeta^{6})^3 = \zeta^{18} = \zeta.
\]

The general implications of the existence of such a sequence, those which Gauss used to treat $x^p - 1 = 0$, lie beyond the scope of this text. However, it is necessary to point out that the existence of such a sequence cannot be taken for granted. For example, if instead of cubing each term we only squared each term, we would obtain the series

\[
\zeta + \zeta^{2} + \zeta^{4} + \zeta^{8} + \zeta^{16} + \zeta^{15} + \zeta^{13} + \zeta^{9} + \zeta + \cdots,
\]

which fails to contain all the imaginary 17-th roots of unity. That some such power does cycle through all the imaginary $p$-th roots of unity will eventually be established in Theorem 7.17. At this point we leave the determination of such powers to trial and error.

For example, if $p = 11$, then squaring works, since

\[
\zeta + \zeta^{2} + \zeta^{4} + \zeta^{8} + \zeta^{5} + \zeta^{10} + \zeta^{9} + \zeta^{7} + \zeta^{3} + \zeta^{6}
\]

contains all the imaginary 11-th roots of unity, whereas cubing does not work, as the series

\[
\zeta + \zeta^{3} + \zeta^{9} + \zeta^{5} + \zeta^{4} + \zeta + \cdots
\]

comes up short.
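The trial and error mentioned here amounts to tracking exponents modulo $p$: raising $\zeta^e$ to the $k$-th power replaces $e$ by $ke \bmod p$. A minimal sketch follows, assuming $p$ is prime and $k$ is not a multiple of $p$; the names `exponent_cycle` and `cycles_through_all` are hypothetical.

```python
def exponent_cycle(p, k):
    """List the exponents 1, k, k**2, ... reduced mod p, until 1 recurs.

    Each entry e stands for the root zeta**e, where zeta is an imaginary
    p-th root of unity; the next entry is the exponent of (zeta**e)**k.
    """
    if k % p == 0:
        raise ValueError("k must not be a multiple of p")
    exponents = [1]
    e = k % p
    while e != 1:
        exponents.append(e)
        e = (e * k) % p
    return exponents


def cycles_through_all(p, k):
    """True if repeated k-th powers reach every imaginary p-th root of unity."""
    return len(exponent_cycle(p, k)) == p - 1


print(exponent_cycle(17, 3), cycles_through_all(17, 3))
print(exponent_cycle(17, 2), cycles_through_all(17, 2))
print(exponent_cycle(11, 2), cycles_through_all(11, 2))
print(exponent_cycle(11, 3), cycles_through_all(11, 3))
```

Cubing succeeds for $p = 17$ and squaring for $p = 11$, while squaring for $p = 17$ and cubing for $p = 11$ return to the start after 8 and 5 steps respectively.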

Three other constructibility problems were mentioned in this section's opening paragraph. Using tools similar to those that were introduced here, together with some linear algebra, it can be shown that the ruler-and-compass constructions in question do not exist. Specifically, the squaring of a circle calls for the ruler-and-compass construction of a square whose area equals that of a given circle. In particular, squaring a circle of radius 1 is tantamount to solving the equation $\pi \cdot 1^2 = x^2$ and so, if successful, such a construction would imply that $x = \sqrt{\pi}$ has a degree 2 expression in the integers. However, in 1882, Ferdinand Lindemann (1852-1939) proved that $\pi$ has no algebraic expressions in the integers whatsoever. Consequently, neither does $\sqrt{\pi}$ have such an expression and hence there is no general ruler-and-compass construction for squaring arbitrary circles. Numbers like $\pi$ that have no algebraic expressions in the integers are said to be transcendental numbers. Other transcendental numbers are $e = 2.718281\ldots$ and

\[
0.1234567891011121314151617181920212223\ldots.
\]


The doubling of a cube calls for the ruler-and-compass construction of the side of a cube that has double the volume of a given cube. In particular, the doubling of the unit cube, if successful, would prove that $\sqrt[3]{2}$ is a constructible number. Again, it is known that $\sqrt[3]{2}$ has no degree 2 algebraic expressions in the integers, and so this construction too is impossible.

Finally, the trisection of an angle calls for the ruler-and-compass construction of an angle that is one-third of a given angle. In particular, if successful, such a construction would yield an angle of $20^\circ = 60^\circ/3$. This in turn would imply that $\cos 20^\circ$ is a constructible number. However,

\[
\tfrac{1}{2} = \cos 60^\circ = \cos(3 \cdot 20^\circ) = 4\cos^3 20^\circ - 3\cos 20^\circ,
\]

and so $\cos 20^\circ$ is a solution of the cubic equation $4x^3 - 3x - \tfrac{1}{2} = 0$. The solutions of this equation can be shown to have no degree 2 algebraic expressions in the integers. Consequently, there is no general ruler-and-compass construction for trisecting angles.
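The triple-angle identity used above can be sanity-checked numerically. This short sketch only confirms that $\cos 20^\circ$ satisfies the cubic to floating-point accuracy; it says nothing about constructibility itself.

```python
import math

x = math.cos(math.radians(20))
# Left-hand side of 4x^3 - 3x - 1/2 = 0 evaluated at x = cos 20 degrees.
residual = 4 * x ** 3 - 3 * x - 0.5
print(x, residual)
```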

The credit for hammering the final nail into the coffins of the angle trisection and 
cube doubling problems in 1837 goes to the little-known French mathematician Pierre 
Wantzel (1814-1848). 


Exercises 2.4 


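1. Show that if $\varepsilon = \cos(2\pi/5) + i\sin(2\pi/5)$, then

\[
\varepsilon = \frac{-1 + \sqrt{5}}{4} + \frac{\sqrt{10 + 2\sqrt{5}}}{4}\, i.
\]


2. With $\varepsilon$ as in Exercise 2.4.1, express $\varepsilon^2$, $\varepsilon^3$, and $\varepsilon^4$ in the form $a + bi$.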


3. If C, D, E, and F are as defined in the proof of Theorem 2.15, prove that 
CD=EF=-1. 


4. If A and B are as defined in the proof of Theorem 2.15, prove that AB = —4. 


5. Prove that if $p$ is a prime integer such that $p - 1$ is a power of 2, then there exists an integer $m$ such that $p = 2^{2^m} + 1$.


6. For which of the integers $n = 3, 4, \ldots, 100$ can you assert that a regular $n$-gon is constructible?


7. For which of the integers $n = 101, 102, \ldots, 200$ can you assert that a regular $n$-gon is constructible?


Sketch diagrams that demonstrate the constructibility of the real numbers in Exer- 


cises 2.4.8 to 2.4.11. 


g. V2 Io. gies 
9. V5 Il. V2 




Suppose you are given a triangle with constructible sides. Which of the following aspects of this triangle, listed in Exercises 2.4.12 to 2.4.20, can you assert to be constructible? Justify your answers.


12. the perimeter
13. the square of the perimeter
14. the sum of the lengths of the medians
15. the square root of the sum of the lengths of the medians
16. the area
17. the square of the area
18. the cube of the area
19. the cube root of the area
20. the product of its area with its perimeter


21. Prove that the area of the regular hexagon whose side is constructible is a constructible real number.


22. Prove that the area of the regular pentagon whose side is constructible is a constructible real number.


23. Explain why the equation $x^8 + x^4 + 1 = 0$ has no real roots and why each of its complex roots is constructible.


24. Explain why the equation $x^{1{,}024} + x^{512} + 1 = 0$ has no real roots and why each of its complex roots is constructible.


25. Let $\eta$ be the first 7-th root of unity. Prove that the quantities $\eta + \eta^2 + \eta^4$ and $\eta^3 + \eta^5 + \eta^6$ are constructible.


26. Let $\alpha$ be the first 11-th root of unity. Prove that the sum of $\alpha$, $\alpha^3$, $\alpha^4$, $\alpha^5$, and $\alpha^9$ is a constructible complex number.


27. Find three distinct 13-th roots of unity whose sum is a constructible number.


28. State and prove the converse of Corollary 2.13.


29. Construct a regular pentagon by means of ruler and compass.


2.5 Orders of Roots of Unity


We have seen that the 4-th roots of unity are $1$, $i$, $-1$, and $-i$ and that the 6-th roots of unity are $1$, $-\omega^2$, $\omega$, $-1$, $\omega^2$, and $-\omega$. However, $-1$ is already a square root of 1, and $\omega$ and $\omega^2$ are also cube roots of 1. If $\zeta$ is any root of unity, then the order of $\zeta$, denoted by $o(\zeta)$, is the least positive integer $m$ such that $\zeta^m = 1$. Thus, $o(1) = 1$, $o(-1) = 2$, $o(\omega) = 3$, $o(-\omega) = 6$, $o(i) = 4$, and $o(-\omega^2) = 6$.




The following proposition on the order of roots may seem obvious, but it does require formal proof. The integer $m$ is said to be a divisor of the integer $n$ (and $n$ is said to be a multiple of $m$) if there is an integer $k$ such that $n = km$, denoted by $m \mid n$. An integer that is greater than 1 and whose only positive divisors are 1 and itself is said to be prime. An integer that is greater than 1 and is not a prime is said to be composite.


Proposition 2.16 If $\zeta$ is any complex root of unity and $n$ is any integer, then $\zeta^n = 1$ if and only if $n$ is a multiple of $o(\zeta)$.


Proof. If $n$ is a multiple of $o(\zeta)$, then there exists an integer $m$ such that $n = o(\zeta)m$ and hence $\zeta^n = (\zeta^{o(\zeta)})^m = 1^m = 1$. Conversely, suppose that $n$ is an integer such that $\zeta^n = 1$. If $n$ is positive then the process of long division yields integers $q$ and $r$ such that $q \ge 0$, $o(\zeta) > r \ge 0$, and $n = o(\zeta)q + r$. But then

\[
\zeta^r = \frac{\zeta^n}{(\zeta^{o(\zeta)})^q} = \frac{1}{1^q} = 1.
\]

Since $0 \le r < o(\zeta)$ and $o(\zeta)$ is the least positive integer $m$ such that $\zeta^m = 1$, it follows that $r = 0$ and hence $n = o(\zeta)q$.

If $n$ is zero, then it is trivially a multiple of $o(\zeta)$. If $n$ is negative, then $-n$ is a positive integer such that

\[
\zeta^{-n} = \frac{1}{\zeta^n} = \frac{1}{1} = 1.
\]

Thus, by the above considerations, $o(\zeta)$ is a divisor of $-n$ and therefore also of $n$. ∎


The following corollary is an immediate consequence of the above proposition. Its proof is relegated to Exercise 2.5.17.


Corollary 2.17 Suppose $\zeta$ is a root of unity and $a$ and $b$ are any two integers. Then
* $\zeta^a = \zeta^b$ if and only if $o(\zeta)$ is a divisor of $a - b$, and
* $1, \zeta, \zeta^2, \ldots, \zeta^{o(\zeta)-1}$ are all distinct.


A primitive $n$-th root of unity is one which is not an $m$-th root for any $m < n$. Thus, $i$ and $-i$ are the only primitive 4-th roots of unity, and $-\omega$ and $-\omega^2$ are the only primitive 6-th roots of unity. On the other hand, $\omega$ and $\omega^2$ are primitive cube roots of unity, and, similarly, every 5-th root of unity except 1 is also a primitive fifth root of unity. It is clear that $\zeta$ is a primitive $n$-th root of unity if and only if $n = o(\zeta)$. It also follows from Corollary 2.17 that $\zeta$ is a primitive $n$-th root of unity if and only if $\zeta^n = 1$ and the numbers $1, \zeta, \zeta^2, \ldots, \zeta^{n-1}$ are all distinct. In particular, the first $n$-th root of unity is always primitive.
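Orders can be tabulated by brute force without any floating-point arithmetic. The sketch below assumes $\zeta$ is the first $n$-th root of unity, so that $o(\zeta) = n$ and, by Proposition 2.16, $(\zeta^k)^m = 1$ exactly when $n$ divides $km$. The function names are hypothetical.

```python
def order_of_power(n, k):
    """Order of zeta**k, where zeta is the first n-th root of unity.

    Searches for the least positive m with n dividing k*m, which by
    Proposition 2.16 is the least m with (zeta**k)**m == 1.
    """
    m = 1
    while (k * m) % n != 0:
        m += 1
    return m


def orders(n):
    """Map each exponent k = 0, ..., n-1 to the order of zeta**k."""
    return {k: order_of_power(n, k) for k in range(n)}


def primitive_exponents(n):
    """Exponents k for which zeta**k is a primitive n-th root of unity."""
    return [k for k in range(n) if order_of_power(n, k) == n]


print(orders(6))
print(primitive_exponents(4), primitive_exponents(5), primitive_exponents(6))
```

For $n = 6$ the first root is $-\omega^2$, and the table reproduces the orders 1, 6, 3, 2, 3, 6 of $1, -\omega^2, \omega, -1, \omega^2, -\omega$ listed earlier in the section.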




We shall state and prove several more important facts regarding the orders of roots of 


unity in Section 5.2, after some necessary tools have been acquired. 


Exercises 2.5 


For each of the values of $n$ in Exercises 2.5.1 to 2.5.8 list the elements of $\sqrt[n]{1}$ with their orders.

1. 4      3. 6      5. 8      7. 10
2. 5      4. 7      6. 9      8. 12


For each of the values of $n$ in Exercises 2.5.9 to 2.5.12 list the primitive elements of $\sqrt[n]{1}$.

9. 6      10. 11      11. 12      12. 24


13. Prove that if $\zeta$ is a root of unity, then $o(\zeta^{-1}) = o(\zeta)$.


14. Prove that if $\zeta$ is a primitive $n$-th root of unity, then so is $\zeta^{-1}$.


15. Prove that if $\zeta$ is a primitive $n$-th root of unity, then $1, \zeta, \zeta^2, \ldots, \zeta^{n-1}$ are all distinct.

16. Prove that if $n$ is a positive odd integer and $\zeta$ is a primitive $n$-th root of unity, then so is $\zeta^2$. Is this also true for even $n$? Justify your answer.

17. Prove Corollary 2.17.

18. Prove that if $\zeta \ne 1$ is any root of unity, then $1 + \zeta + \zeta^2 + \cdots + \zeta^{o(\zeta)-1} = 0$.

19. Prove that if $\zeta$ is any root of unity, then $1 \cdot \zeta \cdot \zeta^2 \cdots \zeta^{o(\zeta)-1} = (-1)^{o(\zeta)-1}$.

20. Prove that if $p$ is a prime number, then every imaginary $p$-th root of unity is necessarily a primitive $p$-th root of unity.

21. Prove that if $n > 4$ is not a prime, then at least three of the $n$-th roots of unity are not primitive $n$-th roots.


22. Prove that if fe 7/1 and Ze V1, then (=+1. 


2.6 The Existence of Complex Numbers


This section is devoted to the construction of a number system whose ontological credentials are impeccable and which is indistinguishable from the complex number system. An alternate proof of the existence of complex numbers is offered in Section 10.3 in a much wider and more useful setting.

We begin by defining a Cartesian number as an ordered pair $(a, b)$ of real numbers. The two Cartesian numbers $z = (a, b)$ and $w = (c, d)$ are considered to be the same, or equal, if and only if $a = c$ and $b = d$. Thus, $(2, 3) \ne (3, 2)$ and $(2^2, 3^3) = (4, 27)$. These Cartesian numbers can be thought of either as pairs of real numbers or as points of the Cartesian plane.


The addition and multiplication of Cartesian numbers are defined as follows:

\[
(a, b) + (c, d) = (a + c,\; b + d),
\]
\[
(a, b) \cdot (c, d) = (ac - bd,\; ad + bc).
\]

These definitions are motivated by the fact that the Cartesian number $(a, b)$ is supposed to be a logical construct that mimics the behavior of the intuitive quantity $a + bi$. Thus the addition and multiplication of Cartesian numbers mimic the facts that $(a + bi) + (c + di) = (a + c) + (b + d)i$ and $(a + bi)(c + di) = (ac - bd) + (ad + bc)i$.

This, of course, is only the motivation for these definitions. From the purely logical stance, these definitions need no justification. The addition and multiplication of Cartesian numbers can be demonstrated to possess all the usual properties that they have in the context of real numbers (these well-known properties will be formalized later in Section 6.1). Thus, the addition of Cartesian numbers is commutative because it inherits this property from the addition of real numbers in the following manner:

\[
(a, b) + (c, d) = (a + c,\; b + d) = (c + a,\; d + b) = (c, d) + (a, b).
\]

Similarly, the multiplication of Cartesian numbers is associative because

\begin{align*}
[(a, b) \cdot (c, d)] \cdot (e, f) &= (ac - bd,\; ad + bc) \cdot (e, f) \\
&= [(ac - bd)e - (ad + bc)f,\; (ac - bd)f + (ad + bc)e] \\
&= (ace - bde - adf - bcf,\; acf - bdf + ade + bce) \\
&= (ace - adf - bcf - bde,\; acf + ade + bce - bdf) \\
&= [a(ce - df) - b(cf + de),\; a(cf + de) + b(ce - df)] \\
&= (a, b) \cdot (ce - df,\; cf + de) \\
&= (a, b) \cdot [(c, d) \cdot (e, f)].
\end{align*}

The remaining proofs of the commutativity, associativity, and distributivity of the addition and multiplication of Cartesian numbers are relegated to Exercises 2.6.1 to 2.6.3. We define the Cartesian zero to be the pair $(0, 0)$ and denote it by $0_c$. It is clear that if $z = (a, b)$ is any Cartesian number, then

\[
z + 0_c = (a, b) + (0, 0) = (a, b) = z.
\]

We define the Cartesian unity to be the pair $(1, 0)$ and denote it by $1_c$. Note that for any Cartesian number $z = (a, b)$,

\[
z \cdot 1_c = (a, b) \cdot (1, 0) = (a \cdot 1 - b \cdot 0,\; a \cdot 0 + b \cdot 1) = (a, b) = z.
\]

Finally, we address the existence of inverses. It is clear that $(-a, -b)$ is the additive inverse of $(a, b)$ in the sense that

\[
(a, b) + (-a, -b) = 0_c.
\]

If $(a, b) \ne 0_c$, then $a^2 + b^2 \ne 0$ and so

\[
(c, d) = \left( \frac{a}{a^2 + b^2},\; \frac{-b}{a^2 + b^2} \right)
\]

is a well-defined Cartesian number and it can be verified (Exercise 2.6.4) that $(c, d)$ is the multiplicative inverse of $(a, b)$ in the sense that $(a, b) \cdot (c, d) = 1_c$.

The foregoing discussion establishes that the set of Cartesian numbers together with the operations of addition and multiplication constitutes a bona fide number system. We now demonstrate that these numbers are just the complex numbers in disguise by identifying amongst them a copy of the real number system together with a quantity that behaves just as the imaginary number $i$ is expected to behave. Observe that

\[
(a, 0) + (c, 0) = (a + c,\; 0 + 0) = (a + c,\; 0)
\]

and

\[
(a, 0) \cdot (b, 0) = (a \cdot b - 0 \cdot 0,\; a \cdot 0 + b \cdot 0) = (ab,\; 0).
\]

In other words, the Cartesian number $(a, 0)$ behaves with respect to Cartesian addition and multiplication just as the real number $a$ behaves with respect to real addition and multiplication. Thus the set of Cartesian numbers whose second coordinate is 0 is indistinguishable from the real number system.

Let $i_c$ denote the Cartesian number $(0, 1)$. Note that

\[
i_c^2 = (0, 1) \cdot (0, 1) = (0 \cdot 0 - 1 \cdot 1,\; 0 \cdot 1 + 1 \cdot 0) = (-1, 0)
\]

and $(-1, 0)$ is the Cartesian number that corresponds to the real number $-1$. Moreover, if $z$ is any Cartesian number $(a, b)$, then

\begin{align*}
(a, 0) + (b, 0)\, i_c &= (a, 0) + (b, 0) \cdot (0, 1) = (a, 0) + (b \cdot 0 - 0 \cdot 1,\; b \cdot 1 + 0 \cdot 0) \\
&= (a, 0) + (0, b) = (a, b) = z.
\end{align*}

This, of course, is the Cartesian analog of the fact that the arbitrary complex number $z$ can be written in the form $a + bi$ where $a$ and $b$ are real numbers and $i$ is a square root of $-1$. Thus, we have shown that the Cartesian numbers, whose existence is justified by definition, behave just like the complex numbers.
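The definitions of this section translate almost verbatim into code. The following is a minimal sketch, not a full number-system implementation: the class name `Cartesian` and the constants `ZERO`, `ONE`, and `I` are hypothetical, and exact rationals (`Fraction`) stand in for the real numbers so that equality tests are exact.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Cartesian:
    """An ordered pair (a, b) with the addition and multiplication of Section 2.6."""
    a: Fraction
    b: Fraction

    def __add__(self, other):
        return Cartesian(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a, b) . (c, d) = (ac - bd, ad + bc)
        return Cartesian(self.a * other.a - self.b * other.b,
                         self.a * other.b + self.b * other.a)

    def inverse(self):
        # (a / (a^2 + b^2), -b / (a^2 + b^2)), defined only for nonzero pairs
        norm = self.a * self.a + self.b * self.b
        if norm == 0:
            raise ZeroDivisionError("the Cartesian zero has no inverse")
        return Cartesian(self.a / norm, -self.b / norm)


ZERO = Cartesian(Fraction(0), Fraction(0))
ONE = Cartesian(Fraction(1), Fraction(0))
I = Cartesian(Fraction(0), Fraction(1))

z = Cartesian(Fraction(2), Fraction(3))
w = Cartesian(Fraction(-1), Fraction(4))
u = Cartesian(Fraction(5), Fraction(-2))

print(I * I)               # the pair corresponding to the real number -1
print(z * z.inverse())     # the Cartesian unity
print((z * w) * u == z * (w * u))
```

Checking a few sample pairs this way illustrates the identities but does not replace the general proofs given in the text and exercises.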


Exercises 2.6 


1. Prove that the addition of Cartesian numbers is associative.
2. Prove that the multiplication of Cartesian numbers is commutative.


3. Prove the distributivity of Cartesian numbers. I.e., prove that

\[
(a, b) \cdot [(c, d) + (e, f)] = (a, b) \cdot (c, d) + (a, b) \cdot (e, f).
\]


4. Prove that if $(a, b) \ne 0_c$, then

\[
(a, b) \cdot \left( \frac{a}{a^2 + b^2},\; \frac{-b}{a^2 + b^2} \right) = 1_c.
\]


5. Prove that if $(a, b) \ne 0_c$ and $(a, b) \cdot (x, y) = (a, b)$ for some $(x, y)$, then $(x, y) = 1_c$.
6. Prove that $(a, b) \cdot 0_c = 0_c$.
7. Prove that if $(a, b) \ne 0_c$ and $(a, b) \cdot (c, d) = 0_c$, then $(c, d) = 0_c$.


Chapter Summary 


This chapter began with an informal definition of the complex numbers and went on to 
a discussion of the four arithmetical operations and the extraction of roots in this new 
context, special emphasis being given to the roots of unity. These operations were used 
to give a formal definition of the concept of solvability by radicals. The geometry of 
the roots of unity was then used to prove the ruler-and-compass constructibility of the 
regular 17-sided polygon. This application relied on some surprising subtleties inherent 
in the powers of the roots of unity. The related notion of the order of a root of unity and 
some of its properties were expounded in the next section. Finally, a rationale justifying 


the existence of the so-called imaginary numbers was offered in the last section. 
Chapter Review Exercises 


Mark the following true or false. 
1. The sum of two complex numbers is a complex number. 
2. The sum of two imaginary numbers is never a real number. 
3. The numbers $0$, $1 + i$, $2 - i$, and $i$ form a parallelogram.
4. $\arg[(1 + 2i)(1 - i)] = \arg(1 - i) + \arg(1 + 2i)$.
§- (1 + 2i)!23| = |(1 + 2i)|!9. 
6. The number 1 has twenty 20-th roots.
7. The number 1 has nineteen 20-th roots of order 20.
8. The elements of /1 forma regular heptagon. 

9. If $a \ne 0$, then the solutions of $ax^2 + bx + c = 0$ have a degree 2 algebraic expression in $\{a, b, c\}$.
10. The solutions of $3x^2 - 7x + 11 = 0$ are constructible.
11. The regular pentagon and the regular 17-sided polygon are the only regular polygons that are constructible.
12. The order of every element of $\sqrt[24]{1}$ is a divisor of 72.


13. Ifo isa primitive 7th root of unity, then a, o?, o°, and &! are all distinct. 


New Terms


algebraic expression, 23
algebraic solution, 23
algebraically resolvable, 24
argument, 11
argument principle, 14
Cartesian number, 38
Cartesian representation, 10
complex number, 9
composite, 37
conjugate, 14
constructible, 26
divisor, 37
doubling of a cube, 34
imaginary number, 10
modulus, 11
multiple, 37
order of a root of unity, 36
polar form, 12
prime, 37
roots of unity, 18
solvable by radicals, 24
transcendental numbers, 34
trisection of an angle, 34
unit segment, 26


Supplementary Exercises


1. Write a computer script that will verify that the regular 257-sided polygon is ruler-and-compass constructible.


2. Write a computer script that will compute the $n$-th roots of any complex number.


3. Is there an analog of Exercise 2.2.30 for squares and other regular polygons?


4. Investigate the number system obtained by replacing the multiplication of Cartesian numbers with

\[
(a, b) \cdot (c, d) = (ac + bd,\; ad + bc).
\]

Which numbers have multiplicative inverses? What are the roots of unity like?


5. Make up your own number system and investigate it.


Chapter 3 


SOLUTIONS OF EQUATIONS 


It is our purpose here to discuss solutions of equations from several different points of view. We will touch on the issues of existence of solutions, existence of formulas, solvability by radicals, and computation of solutions.


3.1 The Cubic Formula


We are now in position to present the modern version of the Ferro-Tartaglia-Cardano solution to the general cubic equation.
Theorem 3.1 Every cubic equation is solvable by radicals.


Proof. For the sake of simplification, we shall assume that the cubic equation we wish to solve has the form

\[
x^3 + ax^2 + bx + c = 0. \tag{3.2}
\]

It is clear that every cubic equation can be reduced to this form. Next, the problem is further simplified by transforming it to a form that is free of the $x^2$ term. This is accomplished by a transformation of the type $x = \alpha + y$, where the value of $\alpha$ will shortly be specified. Substituting $x = \alpha + y$ into Equation 3.2 we get

\[
(\alpha + y)^3 + a(\alpha + y)^2 + b(\alpha + y) + c = 0
\]

or

\[
y^3 + (3\alpha + a)y^2 + (3\alpha^2 + 2a\alpha + b)y + (\alpha^3 + a\alpha^2 + b\alpha + c) = 0.
\]

The choice of $\alpha = -a/3$ will clearly make the above coefficient of $y^2$ vanish. The equation now reduces to

\[
y^3 + py + q = 0 \tag{3.3}
\]

where

\[
p = \frac{3b - a^2}{3} \quad \text{and} \quad q = \frac{27c + 2a^3 - 9ab}{27}.
\]

To solve the reduced cubic of Equation 3.3 we shall make another transformation:

\[
y = z + \frac{\beta}{z}
\]

where the value of $\beta$ will be chosen so that the resulting equation is solvable. When this value of $y$ is substituted into Equation 3.3, we obtain

\[
\left( z + \frac{\beta}{z} \right)^3 + p\left( z + \frac{\beta}{z} \right) + q = 0,
\]

or

\[
z^3 + (3\beta + p)z + q + \frac{3\beta^2 + p\beta}{z} + \frac{\beta^3}{z^3} = 0.
\]

Miraculously, the choice of $\beta = -p/3$ causes the coefficients of both $z$ and $1/z$ to vanish, leaving us with the equation

\[
z^3 + q - \frac{p^3}{27z^3} = 0 \quad \text{or} \quad z^6 + qz^3 - \frac{p^3}{27} = 0. \tag{3.4}
\]

This is a quadratic equation in $z^3$ with solutions

\[
z^3 = \frac{-q \pm \sqrt{q^2 + \frac{4p^3}{27}}}{2}. \tag{3.5}
\]

It is clear that $z$ has an algebraic expression in $p$ and $q$. Hence, if $z \ne 0$, each of the terms of the sequence $p, q, z, y, x$ has an algebraic expression in $\{a, b, c\}$. If $z = 0$, then $y = z + \beta/z$ fails to be an algebraic expression. However, in that case, by Exercise 3.1.10, $x^3 + ax^2 + bx + c = (x + a/3)^3 + c - a^3/27$ and the corresponding equation is clearly algebraically resolvable. ∎


The proof of Theorem 3.1 yields six possible values for $z$ from which one can obtain six possible values for $x$. We shall later see (Proposition 6.8) that a cubic equation can have at most three roots, and we now set out to select three of the above six values that always yield all the solutions of the given cubic equation. The proof that these are the "correct" three values is deferred to Section 6.4. If all of the values of $z$ that arise from Equation 3.5 are 0, then (Exercise 3.1.10)

\[
x^3 + ax^2 + bx + c = \left( x + \frac{a}{3} \right)^3
\]

and so $x = -a/3$ is the triple solution of Equation 3.2. Otherwise, let $z_1$ be any nonzero value of $z$ obtained from Equation 3.5 and set $z_2 = \omega z_1$ and $z_3 = \omega^2 z_1$. For each $i = 1, 2, 3$ set

\[
y_i = z_i - \frac{p}{3z_i} \quad \text{and} \quad x_i = y_i - \frac{a}{3}. \tag{3.6}
\]

Then $\{x_1, x_2, x_3\}$ is the complete solution set to Equation 3.2.
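The recipe of Equations 3.3 to 3.6 can be followed step by step in floating-point complex arithmetic. This is a rough sketch under stated assumptions: the function name `solve_cubic`, the tolerance `tol`, and the rule of picking the value of $z^3$ with the larger modulus are choices made here for illustration and are not part of the text's procedure.

```python
import cmath

OMEGA = complex(-0.5, 3 ** 0.5 / 2)  # a primitive cube root of unity


def solve_cubic(a, b, c, tol=1e-12):
    """Approximate roots of x^3 + a x^2 + b x + c = 0 (coefficients may be complex)."""
    p = (3 * b - a * a) / 3
    q = (27 * c + 2 * a ** 3 - 9 * a * b) / 27
    root = cmath.sqrt(q * q + 4 * p ** 3 / 27)
    # Equation 3.5 gives two values of z^3; take the one of larger modulus
    # so that z_1 is nonzero whenever a nonzero choice exists.
    z_cubed = max((-q + root) / 2, (-q - root) / 2, key=abs)
    if abs(z_cubed) < tol:
        return [-a / 3] * 3  # all values of z are 0: triple solution
    r, theta = cmath.polar(z_cubed)
    z1 = cmath.rect(r ** (1 / 3), theta / 3)  # one cube root of z^3
    roots = []
    for k in range(3):
        z = z1 * OMEGA ** k            # z_1, omega z_1, omega^2 z_1
        roots.append(z - p / (3 * z) - a / 3)
    return roots


def residual(x, a, b, c):
    """Absolute value of the cubic evaluated at x."""
    return abs(x ** 3 + a * x ** 2 + b * x + c)


for coeffs in [(0, -3, 2), (0, 3j, -(1 + 1j)), (-6, 0, -4)]:
    print(coeffs, solve_cubic(*coeffs))
```

Because the arithmetic is approximate, the computed roots satisfy the equation only up to rounding error, so real roots may show tiny imaginary parts.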


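Consider the cubic equation $x^3 - 3x + 2 = 0$. Here $p = -3$, $q = 2$, and

\[
z^3 = \frac{-q \pm \sqrt{q^2 + \frac{4p^3}{27}}}{2} = \frac{-2 \pm \sqrt{4 - 4}}{2} = -1
\]

and so we can choose $z_1 = -1$. This gives

\begin{align*}
x_1 &= y_1 = -1 - \frac{-3}{3(-1)} = -2, \\
x_2 &= y_2 = \omega(-1) - \frac{-3}{3\omega(-1)} = -\omega - \omega^2 = 1, \\
x_3 &= y_3 = \omega^2(-1) - \frac{-3}{3\omega^2(-1)} = -\omega^2 - \omega = 1.
\end{align*}

We next turn to an example with complex coefficients, the equation $x^3 + 3ix - (1 + i) = 0$. Here $p = 3i$, $q = -(1 + i)$, and

\begin{align*}
\frac{-q + \sqrt{q^2 + \frac{4p^3}{27}}}{2} &= \frac{(1 + i) + \sqrt{(1 + i)^2 + \frac{4(3i)^3}{27}}}{2} \\
&= \frac{1 + i + \sqrt{1 + 2i - 1 - 4i}}{2} \\
&= \frac{1 + i + 1 - i}{2} \\
&= 1.
\end{align*}

Consequently, we may choose $z_1 = 1$ and so

\begin{align*}
x_1 &= y_1 = 1 - \frac{3i}{3 \cdot 1} = 1 - i, \\
x_2 &= y_2 = \omega - \frac{3i}{3\omega} = \omega - \omega^2 i, \\
x_3 &= y_3 = \omega^2 - \frac{3i}{3\omega^2} = \omega^2 - \omega i.
\end{align*}

Finally, we examine an equation in which the coefficient of $x^2$ is not zero, namely, the equation $x^3 - 6x^2 - 4 = 0$. Here $a = -6$, $b = 0$, and $c = -4$, so that $p = -12$ and $q = -20$. Substitution of these values into Equation 3.5 yields $z^3 = 10 \pm 6$. Choosing $z_1 = \sqrt[3]{4}$ we then obtain

\begin{align*}
x_1 &= y_1 - \frac{a}{3} = \sqrt[3]{4} - \frac{-12}{3\sqrt[3]{4}} - \frac{-6}{3} = 2 + \sqrt[3]{4} + \sqrt[3]{16}, \\
x_2 &= y_2 - \frac{a}{3} = \sqrt[3]{4}\,\omega - \frac{-12}{3\sqrt[3]{4}\,\omega} - \frac{-6}{3} = 2 + \sqrt[3]{4}\,\omega + \sqrt[3]{16}\,\omega^2, \\
x_3 &= y_3 - \frac{a}{3} = \sqrt[3]{4}\,\omega^2 - \frac{-12}{3\sqrt[3]{4}\,\omega^2} - \frac{-6}{3} = 2 + \sqrt[3]{4}\,\omega^2 + \sqrt[3]{16}\,\omega.
\end{align*}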

In conclusion we note that while the solution of the cubic equation given in Equa- 
tion 3.6 above is formally different from that which appears as Equation 1.5, it is not too 
difficult to show that this latter expression actually equals one of the three roots that are 


given by Equation 3.6. The details are relegated to Exercise 3.1.17. 
Exercises 3.1 


Use the cubic formula to find all the complex roots of each of the equations in Exercises 3.1.1 to 3.1.9.


1. $x^3 + 9x - 6 = 0$
2. $x^3 + 12x - 12 = 0$
3. $x^3 + 18x - 30 = 0$
4. $x^3 - 15x - 126 = 0$
5. $x^3 + 3x^2 + 9x + 5 = 0$
6. $x^3 - 6x^2 + 24x - 44 = 0$
7. $x^3 + 3x^2 - 3x + 2i - 5 = 0$
8. $x^3 + 6ix - 1 - 8i = 0$ (Hint: $\sqrt{-63 - 16i} = \pm(1 - 8i)$.)
9. $2x^3 + 3x^2 + 3x + 1 = 0$


10. Prove that if all the values of $z$ in Equation 3.5 are 0, then $p = q = 0$ and $x^3 + ax^2 + bx + c = (x + a/3)^3$.


Use Exercise 3.1.10 to solve Exercises 3.1.11 and 3.1.12.

11. $x^3 + 3(1 + i)x^2 + 6ix - 2(1 - i) = 0$


12. $x^3 + 6ix^2 - 12x - 8i = 0$


If $x_1$, $x_2$, and $x_3$ are the solutions of the cubic equation $x^3 + ax^2 + bx + c = 0$, then the quantity $[(x_1 - x_2)(x_2 - x_3)(x_3 - x_1)]^2$ is called the discriminant of the given equation.

13. Prove that the discriminant of the equation $y^3 + py + q = 0$ is $-4p^3 - 27q^2$.
14. Prove that the discriminant of the equation $x^3 + ax^2 + bx + c = 0$ is $18abc - 4a^3c + a^2b^2 - 4b^3 - 27c^2$.
15. Prove that a cubic equation with real coefficients has three distinct real roots if its discriminant $D$ is positive, a single real root if $D$ is negative, and at least two equal real roots if $D$ is zero.


16. Show that if $a$ is real, then the equation $x^3 + ax + 2 = 0$ has three real roots if and only if $a \le -3$.
17. Show that the root of the cubic equation given in Equation 1.5 is also one of those given by Equation 3.6 of this section.


3.2 Solvability by Radicals II 


Whereas thousands of years elapsed between the solutions of the quadratic and the general cubic equation, only a few more years passed before Cardano assigned the problem of finding a formula for the general fourth-degree equation, or quartic equation, to his disciple Lodovico Ferrari, who succeeded in solving it. Subsequently, several other mathematicians presented their own solutions of the quartic. From our perspective, the most significant of these is Lagrange's solution, which is presented in detail in Section 6.5.

Quite naturally, mathematicians next turned their attention to the general quintic equation, or fifth-degree equation. This equation, however, presented new difficulties, and no substantial progress was made for another 250 years. During the second half of the eighteenth century some mathematicians recognized the fact that even the innocent looking equation

\[
x^n - 1 = 0 \tag{3.7}
\]

presented them with challenges. They were aware that one of the solutions of the equation $x^{11} - 1 = 0$ is the complex number

\[
\cos\frac{2\pi}{11} + i\sin\frac{2\pi}{11},
\]

and consequently this number can be expressed in terms of a radical of the 11th order (namely $\sqrt[11]{1}$). However, in 1771 Vandermonde and Lagrange showed that the same number could also be expressed in terms of radicals of the 2nd and 5th orders, thus uncovering some surprising relationships between radicals of the 2nd, 5th, and 11th orders.

Equation 3.7 came to be known as the cyclotomic equation because its zeros, as stated in Proposition 2.6, lie on a common circle. In 1801, in the concluding chapter of his Disquisitiones Arithmeticae, Gauss delved into the subtleties of the radicals that are required for the solution of the cyclotomic equation. The proof of Theorem 2.15 was meant to provide a sample of the algebraic manipulations that this task required. The very special nature of the cyclotomic equation notwithstanding, this work turned out to be seminal. Évariste Galois, who completely settled the issue of algebraic resolvability of equations some 30 years later, made repeated references to Gauss's techniques and results in explaining the motivation for his own work.


3.3. Other Types of Solutions 


While this book's main theme is the issue of solvability of polynomial equations by means of algebraic operations, it might be pedagogically advantageous at this point to discuss some other senses in which an equation could be solvable. Consider the equation

\[
x^5 - 6x + 3 = 0. \tag{3.8}
\]

If we set $f(x) = x^5 - 6x + 3$, then

\[
\lim_{x \to \infty} f(x) = \infty \quad \text{and} \quad \lim_{x \to -\infty} f(x) = -\infty.
\]

Since $f(x)$ is a continuous function, it follows that its graph must cross the $x$-axis at some point and that point clearly yields a solution to Equation 3.8. This argument generalizes easily to the following:


Theorem 3.9 If $n$ is an odd positive integer, and $a_1, a_2, \ldots, a_n$ are real numbers, then the equation

\[
x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-1}x + a_n = 0
\]

has a real solution.


Thus, it is possible to argue the existence of a solution to an equation without having 
any information at all about the value of the solution. In fact, the mathematicians of the 
seventeenth and eighteenth centuries became convinced of the validity of the following 


sweeping statement: 


Theorem 3.10 Every equation of the form $x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-1}x + a_n = 0$ has a solution.


The solution whose existence is guaranteed here may be complex, but it always exists. The first valid proof of this fact was provided by Gauss in 1799 and it eventually became known as the Fundamental Theorem of Algebra. Neither this proof, nor any of Gauss's several subsequent alternate proofs, provided information about the actual value of the solution; they were concerned only with its existence. In 1826, Niels Abel proved that no analogs of the Ferro-Tartaglia-Cardano cubic formula could exist for the general fifth-degree equation,

\[
x^5 + ax^4 + bx^3 + cx^2 + dx + e = 0.
\]

Shortly thereafter Évariste Galois constructed a general theory for determining just which general equations have formulas and which specific equations are solvable by radicals. Using that theory it is possible, for example, to show that the roots of Equation 3.8, the existence of at least one of which is easily demonstrated, do not have algebraic expressions in the integers.

There is yet another aspect to solving equations that is essentially different from both 
the issue of existence and from that of expressibility by algebraic operations, and that 
is the problem of evaluating a root and writing it down as, say, a decimal number. Just 
because it is known that a certain number is an algebraic expression in the integers does 
not mean that we have any idea of its size. The root of Equation 3.8 whose existence was 
proven by a theoretical argument, of course, also suffers from the same lack of precision. 

There are numerical methods for finding roots of equations which are much more practical than even the Ferro-Tartaglia-Cardano formula. The best known of these, the Newton-Raphson method, is usually taught in the first semester of calculus but will nevertheless be reviewed here because we feel that this will bring about a better understanding of the difference between solvability of equations in general and solvability of equations by radicals.

Loosely speaking, the Newton-Raphson method, when applied to an equation of the form $f(x) = 0$, says that if $x_n$ is an estimate for a solution, then

\[
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \tag{3.11}
\]

where $f'(x)$ denotes the derivative of $f(x)$ with respect to $x$, is a better estimate for the same solution. Consider Equation 3.8. With $f(x) = x^5 - 6x + 3$, we find that $f(0) = 3$ and $f(1) = -2$. Thus this equation must have a solution between 0 and 1, and we begin with $x_1 = 0$ as our first estimate. Since $f'(x) = 5x^4 - 6$, the Newton-Raphson method yields the following successive estimates (correct to four decimal places) for the solution:

\begin{align*}
x_2 &= 0 - \frac{0^5 - 6 \cdot 0 + 3}{5 \cdot 0^4 - 6} = 0.5, \\
x_3 &= 0.5 - \frac{(0.5)^5 - 6 \cdot 0.5 + 3}{5 \cdot (0.5)^4 - 6} = 0.5055, \\
x_4 &= 0.5055 - \frac{(0.5055)^5 - 6 \cdot 0.5055 + 3}{5 \cdot (0.5055)^4 - 6} = 0.5055.
\end{align*}

Once two successive estimates are equal to each other, there will, of course, be no further improvement in the estimates unless the number of decimal places is increased. It is easily verified that

\[
(0.5055)^5 - 6(0.5055) + 3 = 0.000006981\ldots
\]

and that the solution, correct to six decimal places, is 0.505501.
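The recursion of Equation 3.11 takes only a few lines of code. The sketch below is illustrative: the function name `newton_raphson` and the fixed number of steps are assumptions made here, and unlike the hand computation it does not round to four decimal places between steps.

```python
def newton_raphson(f, f_prime, x0, steps):
    """Return the list of estimates x0, x1, ... produced by x - f(x)/f'(x)."""
    estimates = [x0]
    for _ in range(steps):
        x = estimates[-1]
        estimates.append(x - f(x) / f_prime(x))
    return estimates


def f(x):
    return x ** 5 - 6 * x + 3


def f_prime(x):
    return 5 * x ** 4 - 6


estimates = newton_raphson(f, f_prime, 0.0, 5)
for n, x in enumerate(estimates, start=1):
    print(n, round(x, 6), f(x))
```

Starting from 0, the estimates pass through 0.5 and then settle near 0.5055, in line with the hand computation above.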

This is not the book in which to either discuss the subtleties or attempt a rigorous proof of this marvelous technique. The underlying idea, though, is surprisingly simple. Suppose we have arrived at the estimate $x_n$ for the solution $r$ of the equation $f(x) = 0$ (Figure 3.1). If $P = (x_n, f(x_n))$ is the point of the graph of $y = f(x)$ that lies directly above the $x$-axis point $(x_n, 0)$, then the equation of the line through $P$ and tangent to this graph is given by the point-slope formula as

\[
y - f(x_n) = f'(x_n)(x - x_n). \tag{3.12}
\]

Figure 3.1. The Newton-Raphson method

Note that the diagram suggests that the $x$-intercept $Q$ of the tangent line at $P$ lies closer to our goal of $R = (r, 0)$ than $P$. We therefore choose the abscissa of this intersection as the next estimate $x_{n+1}$. Since $Q$ is the $x$-intercept of the tangent line at $P$, the value of $x_{n+1}$ is easily obtained by substituting $(x_{n+1}, 0)$ for $(x, y)$ in Equation 3.12. This yields

\[
0 - f(x_n) = f'(x_n)(x_{n+1} - x_n).
\]

When this equation is solved for $x_{n+1}$, we obtain the Newton-Raphson recursion of Equation 3.11.

The Newton-Raphson method we presented here is not perfect. There are cases wherein 
this method will miss some rather obvious answers. Nevertheless, when this method does 


locate a root, the root will be correct. 
Exercises 3.3 


Using the Newton-Raphson method, find at least one real root of each of the equations 


in Exercises 3.3.1 to 3.3.8 (use four decimal places). 


1 x? -—3x74+18x+12=0 4. x44x3—x?-—4x—13=0 
2. 2x3 -—5x?+x%-10=0 5. x? —30x+17=0 
3. x4-—100x -—85=0 6. x7 +x?—3=0 


7. $\sin x = x + 1$ (Set $f(x) = \sin x - x - 1$.)


8. $\ln x = \sin x$




Using the Newton-Raphson method (to four decimal places), estimate the real roots in 
Exercises 3.3.9 to 3.3.12. 


9. $\sqrt{10}$ (Solve the equation $x^2 - 10 = 0$.)
q 


1o. 7100 11. 10000 12. 754,321 


For each of the equations in Exercises 3.3.13 to 3.3.18, explain why it does or does not 
have real solutions. 

13. x7+x°—-3=0 16. x!©— 4x8 +5=0 

14. x9 +101x° +93x°+441x44+1=0 17. Inx =tanx 


15. x8—2x441=0 18. x =Inx 


19. Let $a$, $b$, and $c$ be real numbers. Prove that if $a^2 < 3b$, then the equation
$x^3 + ax^2 + bx + c = 0$ has exactly one real root.


Chapter Summary 


We began by showing that every cubic equation is solvable by radicals. While the issue
of solvability of equations by radicals eventually led to the creation of modern algebra,
it was pointed out in this chapter that there are other, no less significant, aspects to the
solvability of equations. The question of the existence of roots can be treated without
regard to explicit derivations. Thus, it is known that every polynomial equation with either
complex or real coefficients has at least one, possibly complex, solution. Ad hoc arguments
can be given for the existence of such roots that provide no information about their values.
Finally, the Newton-Raphson method was informally discussed to show how roots can be found
without the use of radicals.
Chapter Review Exercises 


Mark the following true or false. 
1. Every cubic equation has three distinct roots. 
2. The equation $x^3 + ax^2 + bx + c = 0$ has at least one real root.
3. Every cubic equation has at least one imaginary solution.
4. Every equation can be solved by the Newton-Raphson method.


5- Every root of the equation x*? — 1 = 0 has an algebraic expression in the integers. 




New Terms 

cyclotomic equation, 50 Newton-Raphson method, 51 
fifth-degree equation, 49 quartic equation, 49 
fourth-degree equation, 49 quintic equation, 49 


Fundamental Theorem of Algebra, 51 


Supplementary Exercises 


1. Implement the cubic formula on a computer. 


2. Implement the Newton-Raphson method on a calculator or a computer. 




Chapter 4 


MODULAR ARITHMETIC 


This chapter introduces some new number systems that are suggested by the properties
of the exponents of the complex roots of unity. These number systems bear
striking similarities to the more traditional rational and real numbers but also differ from
them in crucial ways.


4.1 Modular Addition, Subtraction, and Multiplication 


One of the steps in Gauss’s proof of the ruler-and-compass constructibility of the regular 


17-sided polygon called for the verification of the identity 
(+0! \(¢8 +74) ae ¢i4 +25 aan? +070 = git +29 +¢2 +0. 


In his writings Gauss used an abbreviation that replaced each $\zeta^k$ with the symbol $[k]$.
Since $\zeta^{k+17} = \zeta^k$, it follows that, in Gauss's notation, $[k + 17] = [k]$ for each
integer $k$. This is an example of modular arithmetic. For any positive integer $n$, the two
integers $a$ and $b$ are said to be congruent modulo $n$, and we write

$$a \equiv b \pmod{n},$$

whenever $n$ is a divisor of $a - b$. This, by Corollary 2.17, is tantamount to saying that
$\zeta^a = \zeta^b$ where $\zeta$ is any primitive $n$-th root of 1. Thus, $7 \equiv 3 \pmod{4}$,
$2 \equiv 14 \pmod{6}$, and $-3 \equiv 35 \pmod{19}$. Note that if $a \equiv a' \pmod{n}$ and
$b \equiv b' \pmod{n}$, and if $\zeta$ is as above, then

$$\zeta^{a+b} = \zeta^{a}\zeta^{b} = \zeta^{a'}\zeta^{b'} = \zeta^{a'+b'}$$

and

$$\zeta^{ab} = (\zeta^{a})^{b} = (\zeta^{a'})^{b} = (\zeta^{b})^{a'} = (\zeta^{b'})^{a'} = \zeta^{a'b'}$$


Table 4.2 Arithmetic modulo 5 


and hence, $a + b \equiv a' + b' \pmod{n}$ and $ab \equiv a'b' \pmod{n}$.

It follows from this that when performing arithmetic modulo $n$, it suffices to consider
the applications of the arithmetic operations to the integers $0, 1, 2, \ldots, n - 1$ alone.
Tables 4.1 to 4.4 contain such abbreviated addition and multiplication tables for $n = 4$, 5,
6, and 7. Motivated by the standard denotation of the set of integers by $\mathbb{Z}$, arithmetic
modulo $n$, when restricted to the set $\{0, 1, \ldots, n - 1\}$, is denoted by $\mathbb{Z}_n$. An
alternate, and more formal, definition of modular arithmetic is offered in Section 10.1
following Corollary 10.7.

The commutativity, associativity, and distributivity of modular addition and multiplication
are consequences of the following identities: $\zeta^{a+b} = \zeta^{b+a}$,
$\zeta^{(a+b)+c} = \zeta^{a+(b+c)}$, $\zeta^{ab} = \zeta^{ba}$, $\zeta^{(ab)c} = \zeta^{a(bc)}$, and
$\zeta^{a(b+c)} = \zeta^{ab+ac}$. It is also clear that $a \cdot 0 \equiv 0 \cdot a \equiv 0 \pmod{n}$
and $a \cdot 1 \equiv 1 \cdot a \equiv a \pmod{n}$ since $\zeta^{a \cdot 0} = \zeta^{0 \cdot a} = \zeta^{0}$
and $\zeta^{a \cdot 1} = \zeta^{1 \cdot a} = \zeta^{a}$.

Thus, the operations of addition and multiplication possess the same desirable properties
relative to modular equivalence that they have when applied to the integers in the
context of conventional equality.
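Addition and multiplication tables for any modulus can be regenerated mechanically. The Python sketch below is offered only as an illustration; the helper names `addition_table`, `multiplication_table`, and `show` are inventions of this sketch, and the printed layout is not meant to reproduce that of Tables 4.1 to 4.4.

```python
def addition_table(n):
    """Entry [a][b] is a + b reduced modulo n."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]


def multiplication_table(n):
    """Entry [a][b] is a * b reduced modulo n."""
    return [[(a * b) % n for b in range(n)] for a in range(n)]


def show(table, symbol):
    """Print a table with a header row and a header column."""
    n = len(table)
    print(symbol, "|", *range(n))
    print("--+" + "--" * n)
    for a, row in enumerate(table):
        print(a, "|", *row)
    print()


# Arithmetic modulo 5.
show(addition_table(5), "+")
show(multiplication_table(5), "*")
```

Changing the argument 5 to 4, 6, or 7 produces the tables for the other moduli discussed in this section.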


Table 4.3 Arithmetic modulo 6


Table 4.4 Arithmetic modulo 7


If $a$ is any nonzero element of $\mathbb{Z}_n$, then $n - a$ is also in $\mathbb{Z}_n$ and

$$a + (n - a) = n \equiv 0 \pmod{n},$$

and so $n - a$ is the additive inverse $-a$ of $a$ in $\mathbb{Z}_n$. Thus, 3 and 4 are each other's
additive inverses in $\mathbb{Z}_7$, and 3 and 5 are each other's additive inverses in $\mathbb{Z}_8$.
The additive inverse of 0 is, by definition, itself. The guaranteed existence of these additive
inverses makes it possible to define the difference $a - b$ of two elements of $\mathbb{Z}_n$ as the
element $a + (-b)$. Thus, in $\mathbb{Z}_{13}$, $5 - 7 = 5 + 6 = 11$ and $5 - 2 = 5 + 11 = 3$. A moment's
reflection will lead to the conclusion that whenever $a$ and $b$ are integers such that
$0 \le b \le a \le n - 1$, then $a - b$ is unambiguous, regardless of whether $a$ and $b$ are
considered as conventional integers or as elements of $\mathbb{Z}_n$.

The issues of the existence of multiplicative inverses and the possibility of division in
$\mathbb{Z}_n$ are more subtle. There is no integer $x$ such that $2x \equiv 1 \pmod{4}$ since in the
$(\mathbb{Z}_4, \cdot)$ multiplication table (Table 4.1) the row corresponding to 2 does not contain
the entry 1. On the other hand, in the $(\mathbb{Z}_5, \cdot)$ multiplication table (Table 4.2) all the
rows but the first contain a 1, implying that every nonzero element of $\mathbb{Z}_5$ has a
multiplicative inverse in $\mathbb{Z}_5$. Thus, $2 \cdot 3 \equiv 1 \pmod{5}$ and $4 \cdot 4 \equiv 1 \pmod{5}$,
meaning that 2 and 3 are each other's multiplicative inverses in $\mathbb{Z}_5$ whereas 4 is its own
multiplicative inverse, a not so surprising fact, since $4 \equiv -1 \pmod{5}$.

A glance at the multiplication tables for $\mathbb{Z}_6$ and $\mathbb{Z}_7$ makes it clear that the
situation here is entirely analogous to the above. The integer 2 has no multiplicative inverse
in $\mathbb{Z}_6$ whereas every nonzero element of $\mathbb{Z}_7$ does have such an inverse. It will be
seen in the next section that $\mathbb{Z}_n$ possesses all the requisite multiplicative inverses if and
only if $n$ is a prime number.

A word of caution is in order here. While it is true that $7 \equiv 2 \pmod{5}$, it does not
follow from this that $x^7 \equiv x^2 \pmod{5}$ since $2^7 = 128 \equiv 3 \pmod{5}$ whereas
$2^2 = 4$, so that $2^7 \not\equiv 2^2 \pmod{5}$.

Even very complicated looking equations can be easily solved in modular arithmetic
by the naive method of substitution. In $\mathbb{Z}_5$ the solution set of the equation

$$x^6 + 3x^4 + 2x + 4 = 0$$

is $\{1, 2\}$ because these are the only elements of $\mathbb{Z}_5$ whose substitution into the
polynomial $x^6 + 3x^4 + 2x + 4$ yields 0 (mod 5).
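The naive method of substitution is easy to automate. In the Python sketch below, the function name `solve_mod` and the choice to represent a polynomial as a dictionary from exponents to coefficients are assumptions made for this illustration only.

```python
def solve_mod(coeffs, n):
    """Return all x in {0, 1, ..., n-1} at which the polynomial vanishes
    modulo n.  `coeffs` maps each exponent to its coefficient."""
    def value(x):
        # pow(x, e, n) computes x**e reduced modulo n.
        return sum(c * pow(x, e, n) for e, c in coeffs.items()) % n
    return [x for x in range(n) if value(x) == 0]


# x^6 + 3x^4 + 2x + 4 = 0 in Z_5
print(solve_mod({6: 1, 4: 3, 1: 2, 0: 4}, 5))  # prints [1, 2]
```

Since every element of the modulus is tried in turn, the running time grows with $n$; this is adequate for the small moduli of the exercises.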


Exercises 4.1 


Solve the equations in Exercises 4.1.1 to 4.1.5 in Z,. 
1. x+1=0 2. x?+1=0 3. x +x=0 
4. OP +x74+x4+1=0 5. x7 445 4x73=0 


6. Solve the equations of Exercises 4.1.1 to 4.1.5 in Z3. 


7. Solve the equations of Exercises 4.1.1 to 4.1.5 in Z,. 
Solve the equations in Exercises 4.1.8 to 4.1.15 in Z,. 


8. 3x4+2=1 10. x°+x7+4x+1=0 


9. 5x7+2x—-1=0 ir. 4x—3y =1 and 2x+4y =3 


12, 4x+3y=1 and x+6y=3 
13. x+2y+3z22=1, 3x+y+22=2, and 2x+3y+z2=3 


14. 4123456 4 ¥45=0 15. 94321 4 945432 4 344150 


16. Solve the equations in Exercises 4.1.11 to 4.1.13 in Z,. 

17. Solve the equations in Exercises 4.1.11 and 4.1.12 in Z,3. 

18. Solve the equations in Exercises 4.1.14 and 4.1.15 in Z,. 

19. Solve the equations in Exercises 4.1.14 and 4.1.15 in Z,;. 

20. Evaluate the (modulo $n$) sum of all the elements of $\mathbb{Z}_n$ for each positive integer $n$.
21. Evaluate 1+3+5+---+1,001 in Z, o9). 

22. Evaluate 1+4+7+---+ 1,234 in Z,,,. 

23. Evaluate 1+2+4+---+2% in Zia 

24. Evaluate 1+3+9+---+3” in Z,. 


25. Prove that an integer is divisible by 3 if and only if the sum of its digits in base 10 
notation is divisible by 3. 

26. Prove that an integer is divisible by 9 if and only if the sum of its digits in base 10 
notation is divisible by 9. 

27. Let $\zeta$ be the first $n$-th root of unity, and let $\eta = \zeta^k$ be any other $n$-th root of
unity. Prove that there exists an integer $m$ such that $\zeta = \eta^m$ if and only if $k$ has a
multiplicative inverse in $\mathbb{Z}_n$.




4.2. The Euclidean Algorithm and Modular Inverses 


It turns out that the best way to deal with the question of invertible elements in $\mathbb{Z}_n$
involves the Euclidean algorithm for finding the greatest common divisor of two integers.
This is a problem that was considered by many of the earliest mathematicians, including
Euclid. An integer that is a divisor of both the integers $m$ and $n$ is said to be a common
divisor. If such a common divisor of $m$ and $n$ has the additional property that it is
divisible by all the common divisors of $m$ and $n$, then it is the greatest common divisor
(GCD) or highest common factor (HCF) of $m$ and $n$ and it is denoted by $(m, n)$. Thus, ±1,
±2, ±3, ±4, ±6, and ±12 are all the common factors of 24 and 36, but their HCF is 12.
Thus,
12 = (24, 36) = (−24, 36) = (−24, −36).


Note that since every integer divides 0, it follows that (0, 0) does not exist. In Propositions
1 and 2 of Book VII of The Elements Euclid suggests the following method for finding
the greatest common divisor of the two positive integers $m \ge n$. Suppose first that $n$ is
a divisor of $m$. Then it is clear that $(m, n) = n$. If $n$ is not a divisor of $m$, then $n$ and
$m - n$ are positive integers such that $(m - n, n) = (m, n)$. The reason for this is that every
common divisor of $m$ and $n$ is clearly also a common divisor of $m - n$ and $n$, and, vice
versa, every common divisor of $m - n$ and $n$ is also a divisor of $m = (m - n) + n$ and $n$.
In other words, the set of common divisors of the pair $\{m - n, n\}$ is identical with the
set of common divisors of the pair $\{m, n\}$. Consequently, the two pairs also have the
same greatest common divisor.


This leads to the following derivation of the greatest common divisor of 481 and 74: 
(481, 74) = (407, 74) = (333, 74) = (259, 74) = (185, 74) = (111, 74) = (37, 74) = 37. 


It is clear that this procedure will always yield the greatest common divisor in a finite
number of steps, and hence it does deserve to be called an algorithm. It is also clear that
the repeated subtractions that lead from (481, 74) to (37, 74) could be abbreviated by
observing that 37 is the remainder left by 481 when divided by 74. Thus, if $p$ is greater
than $q$ and if $p$ leaves remainder $r$ when divided by $q$, we have $(p, q) = (r, q)$. This
leads to a much faster algorithm for finding greatest common divisors.

To find the greatest common divisor of 2,227 and 12,707, we note that the remainder
left by 12,707 upon division by 2,227 is 1,572 and hence (12,707, 2,227) equals
(2,227, 1,572). Several applications of this reduction yield
(12,707, 2,227) = (1,572, 2,227) = (1,572, 655) = (262, 655) = (262, 131) = 131.
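Both versions of the procedure, repeated subtraction and the abbreviated form that uses remainders, can be sketched in a few lines of Python. The function names `gcd_by_subtraction` and `gcd_by_remainders` are hypothetical, and the sketch returns the positive greatest common divisor.

```python
def gcd_by_subtraction(m, n):
    """Euclid's procedure for two positive integers: keep replacing the
    larger number by the difference until the two numbers are equal."""
    if m <= 0 or n <= 0:
        raise ValueError("both integers must be positive")
    while m != n:
        if m > n:
            m -= n
        else:
            n -= m
    return m


def gcd_by_remainders(m, n):
    """The faster version: replace the pair (m, n) by (n, m mod n)."""
    m, n = abs(m), abs(n)
    if m == 0 and n == 0:
        raise ValueError("(0, 0) does not exist")
    while n != 0:
        m, n = n, m % n
    return m


print(gcd_by_subtraction(481, 74))     # prints 37
print(gcd_by_remainders(12707, 2227))  # prints 131
```

The subtraction version needs seven steps for the pair (481, 74), while the remainder version needs only two divisions, which illustrates why the abbreviated form is preferred.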


This highly efficient procedure for determining the greatest common divisor of two
integers, commonly known as the Euclidean algorithm, has a very surprising range of
nonobvious applications. Here, of course, we are interested in the question of which
integers possess multiplicative inverses in modulo $n$ arithmetic. Two numbers are said to
be relatively prime whenever their greatest common divisor is 1. It is exactly those integers
that are relatively prime to $n$ that turn out to possess multiplicative inverses modulo $n$.
The following proposition paves the way to proving this fact.


Proposition 4.1 Let $m$ and $n$ be two integers which are not both 0. Then $g = (m, n)$
exists, as do two integers $A$ and $B$ such that $(m, n) = Am + Bn$.


Proof. It suffices to prove this proposition when $m$ and $n$ are both positive (Exercise 4.2.25).
Thus, we assume that $m \ge n > 0$ and proceed by induction on the number
of steps (divisions) in the application of the Euclidean algorithm to $m$ and $n$.

If the pair $(m, n)$ is such that exactly one division is required by the Euclidean algorithm,
that must be because $m$ is a multiple of $n$, say $m = dn$ for some integer $d$. In that case
$g = n$ and so $g = 0 \cdot m + 1 \cdot n$ is the required expression.

Assume next that $m \ge n$ are a fixed pair of positive integers the derivation of whose
greatest common divisor requires $k > 1$ divisions, and that the proposition holds for all
positive integers $x$ and $y$ the derivation of whose greatest common divisor requires $k - 1$
divisions. Let $q > 0$ and $r < n$ be the respective quotient and remainder of $m$ when
divided by $n$ so that $m = qn + r$. By the abbreviated Euclidean algorithm, $(m, n) = (n, r)$.
However, the derivation of $(n, r)$ clearly requires one division less than the derivation of
$(m, n)$. In other words, the derivation of $(n, r)$ requires only $k - 1$ divisions, and so, by
the induction hypothesis, there exist integers $A'$ and $B'$ such that

$$g = (m, n) = (n, r) = A'n + B'r.$$

However, by the definition of $q$ and $r$, $m = qn + r$ so that

$$g = A'n + B'r = A'n + B'(m - qn) = B'm + (A' - B'q)n. \qquad (4.2)$$

Hence,

$$A = B' \quad \text{and} \quad B = A' - B'q \qquad (4.3)$$

are the required integers. □
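The induction in this proof translates directly into a recursive procedure. The Python sketch below is one possible rendering under that reading; the name `extended_gcd` is an assumption of the sketch, and positive arguments are assumed, as in the proof.

```python
def extended_gcd(m, n):
    """Return (g, A, B) with g = (m, n) = A*m + B*n, for positive m and n."""
    q, r = divmod(m, n)          # m = q*n + r with 0 <= r < n
    if r == 0:
        # One division suffices: g = n = 0*m + 1*n.
        return n, 0, 1
    # Induction hypothesis applied to the pair (n, r): g = A1*n + B1*r.
    g, A1, B1 = extended_gcd(n, r)
    # Equations 4.2 and 4.3: g = B1*m + (A1 - B1*q)*n.
    return g, B1, A1 - B1 * q


g, A, B = extended_gcd(201, 37)
print(g, A, B)  # prints 1 7 -38
```

Each recursive call corresponds to one division of the Euclidean algorithm, so the depth of the recursion equals the number of divisions counted in the proof.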
Corollary 4.4 If $m$ is relatively prime to $n$, then $m$ has a multiplicative inverse in $\mathbb{Z}_n$.


Proof. Let $m$ and $n$ be relatively prime, so that $(m, n) = 1$. By Proposition 4.1 there
exist integers $A$ and $B$ such that $Am + Bn = 1$. Since $Bn \equiv 0 \pmod{n}$, it follows that
$Am \equiv 1 \pmod{n}$, so that $A$ is the multiplicative inverse of $m$ in $\mathbb{Z}_n$. □


Since

$$1 \cdot 1 \equiv 3 \cdot 3 \equiv 5 \cdot 5 \equiv 7 \cdot 7 \equiv 1 \pmod{8},$$

it follows that 1, 3, 5, and 7, which are all relatively prime to 8, are their own
multiplicative inverses in $\mathbb{Z}_8$. Similarly, since

$$1 \cdot 1 \equiv 2 \cdot 5 \equiv 4 \cdot 7 \equiv 8 \cdot 8 \equiv 1 \pmod{9},$$

it follows that the multiplicative inverses of 1, 2, 4, 5, 7, and 8 in $\mathbb{Z}_9$ are 1, 5, 7, 2, 4,
and 8, respectively.

The converse of Corollary 4.4 also holds (Exercise 4.2.26). Whenever $n$ is a prime
number, each positive integer less than $n$ is necessarily relatively prime to $n$, so that
each of those has a multiplicative inverse in $\mathbb{Z}_n$. This fact is crucial to the subsequent
development of this book and so we state it explicitly.


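Proposition 4.5 If $p$ is a prime number and $m$ is an integer, $0 < m < p$, then $m$ has a
multiplicative inverse in $\mathbb{Z}_p$.

The list below indicates the multiplicative inverses of all the nonzero elements of $\mathbb{Z}_2$,
$\mathbb{Z}_3$, $\mathbb{Z}_5$, and $\mathbb{Z}_7$:

$$1 \cdot 1 \equiv 1 \pmod{2},$$
$$1 \cdot 1 \equiv 2 \cdot 2 \equiv 1 \pmod{3},$$
$$1 \cdot 1 \equiv 2 \cdot 3 \equiv 4 \cdot 4 \equiv 1 \pmod{5},$$
$$1 \cdot 1 \equiv 2 \cdot 4 \equiv 3 \cdot 5 \equiv 6 \cdot 6 \equiv 1 \pmod{7}.$$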




Let $p$ be a fixed prime number and let $m$ be any integer that is not divisible by $p$.
Denote the multiplicative inverse of $m$ by $m^{-1}$. We can then define

$$\frac{a}{b} \equiv ab^{-1} \pmod{p}$$

whenever $b \not\equiv 0 \pmod{p}$. Accordingly, $3/4 \equiv 3 \cdot 2 \equiv 6 \pmod{7}$. Thus, from the
point of view of the four arithmetical operations, modulo $p$ arithmetic (when $p$ is a prime) is
just as well behaved as the arithmetic of the rational numbers, or the real numbers, or
the complex numbers. This observation will be formalized in Section 6.1.

The proof of Proposition 4.1 provides an effective method for finding $a^{-1}$ whenever it
exists in $\mathbb{Z}_n$. Thus, to find the multiplicative inverse of 37 in $\mathbb{Z}_{201}$, we begin
by applying the Euclidean algorithm to 201 and 37. This gives

$$(201, 37) = (37, 16) = (16, 5) = (5, 1) = 1,$$

with

$$201 = 5 \cdot 37 + 16,$$
$$37 = 2 \cdot 16 + 5,$$
$$16 = 3 \cdot 5 + 1,$$
$$5 = 5 \cdot 1.$$

Now,

$$1 = 0 \cdot 5 + 1 \cdot 1,$$

and when $A' = 0$, $B' = 1$, $m = 16$, $n = 5$, and $q = 3$ are substituted in Equations 4.2
and 4.3, we get $A = 1$ and $B = -3$, so that

$$1 = 1 \cdot 16 + (-3) \cdot 5.$$

Again, when $A' = 1$, $B' = -3$, $m = 37$, $n = 16$, and $q = 2$ are substituted in Equations 4.2
and 4.3, we get $A = -3$ and $B = 7$ so that

$$1 = (-3) \cdot 37 + 7 \cdot 16.$$




Finally, when $A' = -3$, $B' = 7$, $m = 201$, $n = 37$, and $q = 5$ are substituted in
Equations 4.2 and 4.3, we get $A = 7$ and $B = -38$ so that

$$1 = 7 \cdot 201 + (-38) \cdot 37.$$

This means that $-38 \equiv 163$ is the multiplicative inverse of 37 in $\mathbb{Z}_{201}$.
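The same computation can be carried out without recursion by updating, at each division, the coefficient that multiplies the element being inverted. The following Python sketch is an iterative variant of the method, not the substitution scheme of the text itself; the function name `mod_inverse` is hypothetical.

```python
def mod_inverse(a, n):
    """Return the multiplicative inverse of a in Z_n (n > 1),
    or None when a and n are not relatively prime."""
    # Invariant: old_r = old_s * a (mod n) and r = s * a (mod n).
    old_r, r = a % n, n
    old_s, s = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        return None          # (a, n) > 1, so no inverse exists
    return old_s % n         # reduce to a representative in {0, ..., n-1}


print(mod_inverse(37, 201))         # prints 163
print(3 * mod_inverse(4, 7) % 7)    # 3/4 in Z_7; prints 6
print(mod_inverse(2, 4))            # prints None
```

The final reduction `old_s % n` plays the role of replacing −38 by 163 in the worked example.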

This section’s final propositions state some basic number theoretic facts that may seem 
rather obvious. It is greatly to Euclid’s credit that he realized that these facts actually 
called for proofs. The proofs we offer boil down to a clever application of the Euclidean 
algorithm and are essentially the same as those which appear in Euclid’s The Elements. 


The content of these propositions will be assumed by several subsequent proofs. 


Lemma 4.6 Let $k$, $m$, and $n$ be integers such that $k$ is a divisor of the product $mn$. If
$k$ is relatively prime to $m$, then $k$ is a divisor of $n$. If $k$ is prime, then it divides either
$m$ or $n$ (or both).


Proof. Suppose $k$ is relatively prime to $m$. By Proposition 4.1 there exist integers $A$ and
$B$ such that $1 = Am + Bk$, and so $n = Amn + Bkn$. Since $k$ is a divisor of both $Amn$
and $Bkn$, it follows that $k$ divides their sum $n$. If $k$ is prime and does not divide $m$,
then $k$ is relatively prime to $m$, and so, by the first part, $k$ divides $n$. □


Corollary 4.7 Let $m$ and $n$ be relatively prime integers. If $m$ and $n$ are divisors of $k$,
then so is $mn$ a divisor of $k$.


Proof. Suppose $k = k_1 m$. Since $n$ is a divisor of $k = k_1 m$ and $n$ is relatively prime to $m$,
it follows from Lemma 4.6 that $n$ is a divisor of $k_1$. If $k_1 = k_2 n$, then $k = k_1 m = k_2 nm$,
and so $mn$ is a divisor of $k$. □


Let $m_1, m_2, \ldots, m_n$ be integers which are not all zero.¹ If $d$ is a divisor of each $m_i$,
then $d$ is a common divisor of the set $\{m_1, m_2, \ldots, m_n\}$. If in addition, $d$ is divisible
by every common divisor of this set, then $d$ is that set's greatest common divisor (GCD)
or highest common factor (HCF).


Theorem 4.8 Let $m_1, m_2, \ldots, m_n$ be integers not all of which are zero. Then there
exist integers $x_1, x_2, \ldots, x_n$ such that

$$\mathrm{GCD}(m_1, m_2, \ldots, m_n) = x_1 m_1 + x_2 m_2 + \cdots + x_n m_n.$$


¹The rest of this section consists of optional material which will not be used until well into Chapter 13.


Proof. Let $S$ be the set of all the integer combinations of the form

$$x_1 m_1 + x_2 m_2 + \cdots + x_n m_n$$

where each $x_i \in \mathbb{Z}$. The choice of $x_1 = x_2 = \cdots = x_n = 0$ demonstrates that, at the
very least, 0 is an element of $S$. If not all the $m_i$'s are zero, then the set contains a nonzero
element, say $r$. If $r$ is negative, then $-r$ is a positive element of $S$ and hence it may be
assumed that $r$ is a positive number in $S$.

Let $S^+$ denote the set of all the positive integers of $S$. Since $S^+$ is known to be nonempty,
it follows from version 5 of the Principle of Mathematical Induction (Appendix F) that
$S^+$ has a minimum element, say $s$. Next we show, by contradiction, that $S$ is the set
of all the multiples of $s$. Suppose $t$ is an element of $S$ that is not a multiple of $s$. By
the division algorithm, there exist numbers $q$ and $r$ such that $0 < r < s$ and $t = qs + r$.
But then $r = t - qs \in S$, contradicting the minimality of $s$ in $S^+$, since $0 < r < s$. Thus,
every element of $S$ is a multiple of $s$. Since $s$ is a linear combination of the $m_i$'s, so is
every multiple of $s$. Thus, the set of multiples of $s$ is equal to the set of integer
combinations of the $m_i$'s.

Since the elements of $S$ are, by their definition, integer combinations of the $m_i$'s, it
follows that $s$ is the smallest positive integer that can be expressed as an integer
combination of the $m_i$'s: say $s = x_1 m_1 + x_2 m_2 + \cdots + x_n m_n$. Since
$g = \mathrm{GCD}(m_1, m_2, \ldots, m_n)$ divides each $m_i$, it follows that

$$g \;\Big|\; \sum_{i=1}^{n} x_i m_i$$

or $g \mid s$. Because $s$ is the minimum positive integer combination of $m_1, m_2, \ldots, m_n$,
and each $m_i$, being an element of $S$, is a multiple of $s$, the integer $s$ is a common divisor
of the $m_i$'s; hence $s$ divides $g$ and $s \le g$. Hence, since $s$ is positive,

$$g = s = x_1 m_1 + x_2 m_2 + \cdots + x_n m_n. \qquad □$$


Exercises 4.2 


Find the greatest common divisor of the pairs of integers in Exercises 4.2.1 to 4.2.6. 


1. 0 and 365                4. 3,367 and 4,277
2. 1 and 3,600              5. 123,456 and 862,091
3. 36 and 48                6. 14,540,165 and 85,050,243



Find the multiplicative inverses of the elements of Z,, in Exercises 4.2.7 to 4.2.11. 


7. 25      8. 66      9. 41      10. 37      11. 2


Find the multiplicative inverses of the elements of Z,, in Exercises 4.2.12 to 4.2.16. 


12. 25      13. 72      14. 41      15. 33      16. 2


17. Find the multiplicative inverse of 4,096 in Z,. <5.

18. Find the multiplicative inverse of 1,000 in Z,, <3,.

19. Find the multiplicative inverse of each of the invertible elements of Z,,.

20. Find the multiplicative inverse of each of the invertible elements of Z,,.

21. Does the equation $399x + 703y = 114$ have an integer solution in $x$ and $y$? If so,
find one; if not, explain why not. (Hint: use Proposition 4.1.)

22. Does the equation $399x + 703y = 115$ have an integer solution in $x$ and $y$? If so,
find one; if not, explain why not. (Hint: use Proposition 4.1.)

23. Suppose $m$ and $n$ are integers; characterize those integers $k$ for which the equation
$mx + ny = k$ has integer solutions in $x$ and $y$. Prove your answer.

24. If $g = (m, n)$, where $m$ and $n$ are integers, show that $g$ is the smallest positive
integer that can be expressed in the form $Am + Bn$ where $A$ and $B$ vary over all
integers.

25. Explain why it suffices to prove Proposition 4.1 for positive integers.

26. Prove that $m$ has a multiplicative inverse in $\mathbb{Z}_n$ if and only if it is relatively prime
to $n$.

27. Prove that if $m$ and $n$ are two integers, then $(m, n)$ is divisible by every common
divisor of $m$ and $n$.

28. Let $m$ and $n$ be relatively prime positive integers. Prove that there exists an integer
$k_0$ such that for any integer $k > k_0$, $k = Am + Bn$ for some positive integers $A$ and
$B$.

29. Prove that if $p$ is a prime and $a$ and $b$ are any integers such that $a^2 \equiv b^2 \pmod{p}$,
then $a \equiv \pm b \pmod{p}$. Is this also true when $p$ is not a prime?

30. Prove that $1 \cdot 2 \cdot 3 \cdots (p - 1) \equiv -1 \pmod{p}$ whenever $p$ is a prime.


31. The smallest positive multiple of both the integers $k$ and $m$ is called their least
common multiple.

(a) Prove that every common multiple of two integers is divisible by their least
common multiple.

(b) Prove that the least common multiple of any two positive integers $k$ and $m$ is
$km/(k, m)$.

32. Let $m_1, m_2, \ldots, m_r$ denote $r$ positive integers that are pairwise relatively prime, and
let $a_1, a_2, \ldots, a_r$ denote any $r$ integers. Then the congruences $x \equiv a_i \pmod{m_i}$,
$i = 1, 2, \ldots, r$, have common solutions. Any two solutions are congruent modulo
$m_1 m_2 \cdots m_r$. (This is known as the Chinese Remainder Theorem.)

33. Suppose $a/b$ is a rational zero of the equation

$$a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n = 0,$$

where $a$ and $b$ are relatively prime integers and $a_0, a_1, \ldots, a_n$ are arbitrary integers.
Prove that $a$ is a divisor of $a_n$ and that $b$ is a divisor of $a_0$.

34. Prove that the equation $3x^3 - 3x^2 + 17x - 4 = 0$ has no rational roots.

35. Let $a$ and $n$ be any positive integers such that $a$ is not an $n$-th power of any integer.
Prove that $\sqrt[n]{a}$ is not a rational number.

36. Let $a$ and $b$ be nonzero integers and $g = (a, b)$. Prove that $(a/g, b/g) = 1$.

37. Let $a$, $b$, and $c$ be any integers such that $g = (a, b)$ is a divisor of $c$. Prove that if
$x_0$, $y_0$ are any integers such that $ax_0 + by_0 = (a, b)$, then the complete solution set
of the equation $ax + by = c$ is

$$\{ (x, y) \mid x = (c/g)x_0 + (b/g)t, \; y = (c/g)y_0 - (a/g)t, \; t \in \mathbb{Z} \}.$$

38. Prove that if p is a prime and a, 8 < /1 and a# 1, then there exists an integer m
such that @” = p.




4.3 Radicals in Modular Arithmetic 


It might be of interest to consider briefly the issue of radicals in the context of modular
arithmetic. In analogy with the more conventional number systems, if $a$ is in $\mathbb{Z}_n$, then
$\sqrt{a}$ is the set of all the elements $x$ of $\mathbb{Z}_n$ such that $x^2 \equiv a \pmod{n}$. Thus, in
$\mathbb{Z}_5$, $\sqrt{0} = \{0\}$, $\sqrt{1} = \{1, 4\}$ and $\sqrt{4} = \{2, 3\}$, whereas 2 and 3 have no square
roots in $\mathbb{Z}_5$. Similarly, in $\mathbb{Z}_7$, $\sqrt{1} = \{1, 6\}$, $\sqrt{2} = \{3, 4\}$, and
$\sqrt{4} = \{2, 5\}$, whereas 3, 5, and 6 have no square roots in $\mathbb{Z}_7$. It is interesting to note
that in $\mathbb{Z}_5$

$$\sqrt{-1} = \sqrt{4} = \{2, 3\} \pmod{5}$$

whereas $\sqrt{-1} = \sqrt{6}$ does not exist modulo 7.
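For small moduli these sets of square roots can be found by trying every element in turn. The Python sketch below does exactly that; the name `square_roots` is an invention of this sketch, and an empty list signals that no square root exists.

```python
def square_roots(a, n):
    """Return all x in {0, 1, ..., n-1} with x*x congruent to a modulo n."""
    return [x for x in range(n) if (x * x - a) % n == 0]


print(square_roots(4, 5))    # prints [2, 3]
print(square_roots(-1, 5))   # prints [2, 3]
print(square_roots(2, 7))    # prints [3, 4]
print(square_roots(-1, 7))   # prints []

# The elements of Z_7 that possess square roots in Z_7.
print([a for a in range(7) if square_roots(a, 7)])   # prints [0, 1, 2, 4]
```

Replacing `x * x` by `x * x * x` gives the analogous search for cube roots.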

The answer to the question of just which elements of $\mathbb{Z}_p$, $p$ prime, do possess square
roots in $\mathbb{Z}_p$ is the subject of the Law of Quadratic Reciprocity. This theorem, conjectured
by both Euler and Legendre and first proved by Gauss, is one of the most important
theorems of number theory. It is discussed in detail in Chapter 12.

Higher order radicals are defined in a manner similar to that in which the square roots
were defined. The question of existence here is much less understood, though.
Exercises 4.3 

Determine which of the elements of $\mathbb{Z}_n$ have square roots in $\mathbb{Z}_n$ for the values of $n$ in
Exercises 4.3.1 to 4.3.4.

1. 7      2. 10      3. 13      4. 17

Determine which of the elements of $\mathbb{Z}_n$ have cube roots in $\mathbb{Z}_n$ for the values of $n$ in
Exercises 4.3.5 to 4.3.8.

5. 7      6. 11      7. 13      8. 17


9. Let $p$ be any prime integer, and let $a$ and $b$ be nonzero elements of $\mathbb{Z}_p$. Show
that $ab$ has a square root in $\mathbb{Z}_p$ if and only if either both $a$ and $b$ have such square
roots or both $a$ and $b$ fail to have such square roots.




4.4 The Fundamental Theorem of Arithmetic 


The Euclidean algorithm of Section 4.2 provides us with the means for proving the 
Fundamental Theorem of Arithmetic which states that every positive integer can be 


factored into the product of prime numbers in an essentially unique way. 


Theorem 4.9 Let $n$ be a positive integer that is greater than 1. Then there exist prime
numbers $p_1 < p_2 < \cdots < p_k$ and positive integers $r_1, r_2, \ldots, r_k$ such that

$$n = p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}.$$

Moreover, if $q_1 < q_2 < \cdots < q_\ell$ is another list of primes and $s_1, s_2, \ldots, s_\ell$ is another
list of positive integers such that

$$n = q_1^{s_1} q_2^{s_2} \cdots q_\ell^{s_\ell},$$

then $\ell = k$, $p_i = q_i$, and $r_i = s_i$ for $i = 1, 2, \ldots, k$.


Proof. We first prove the existence of such a factorization into primes by induction on $n$.
If $n = 2$, we can clearly use $k = 1$ and $p_1 = 2$. Let $n$ be any positive integer such that
every smaller integer that exceeds 1 has a factorization into primes. If $n$ is a prime, then
we can again set $k = 1$ and $p_1 = n$ to get a trivial factorization of $n$ into primes. If $n$ is
not prime, say $n = n_1 n_2$ for some positive integers $n_1$ and $n_2$ both bigger than 1, then,
by the induction hypothesis, $n_1$ and $n_2$ both have prime factorizations, and the product
of these two factorizations yields a factorization of $n$ into primes.

The uniqueness of the prime factorizations is also demonstrated by induction. If $n$ is
prime, it cannot be expressed as the product of other primes, and so $n = n$ is the only
prime factorization of $n$. Let $n$ be a positive integer such that every smaller integer that
exceeds 1 has a unique prime factorization. Suppose now that

$$n = p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k} = q_1^{s_1} q_2^{s_2} \cdots q_\ell^{s_\ell}.$$

A repeated application of Lemma 4.6 leads us to the conclusion that $p_1 = q_j \ge q_1$ for
some $j = 1, 2, \ldots, \ell$. A symmetrical argument allows us to conclude that, in fact, $p_1 = q_1$.
Consequently,

$$n/p_1 = p_1^{r_1 - 1} p_2^{r_2} \cdots p_k^{r_k} = q_1^{s_1 - 1} q_2^{s_2} \cdots q_\ell^{s_\ell}.$$

The theorem now follows from an application of the induction hypothesis (of the uniqueness
of factorization) to the smaller number $n/p_1$. □
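The existence half of the proof suggests a simple, if slow, way to compute the factorization: divide out the smallest available prime as often as possible and repeat. The Python sketch below uses trial division for this purpose; the name `prime_factorization` and the output format, a list of pairs $(p_i, r_i)$, are assumptions of the sketch.

```python
def prime_factorization(n):
    """Return [(p1, r1), ..., (pk, rk)] with p1 < ... < pk prime and
    n equal to the product of the powers p_i**r_i, for an integer n > 1."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            r = 0
            while n % p == 0:   # divide out p as often as possible
                n //= p
                r += 1
            factors.append((p, r))
        p += 1
    if n > 1:
        # What remains has no divisor up to its square root, so it is prime.
        factors.append((n, 1))
    return factors


print(prime_factorization(2730))     # prints [(2, 1), (3, 1), (5, 1), (7, 1), (13, 1)]
print(prime_factorization(1048576))  # prints [(2, 20)]
```

Any divisor found by this loop is automatically prime, because all smaller primes have already been divided out by the time it is reached.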




Exercises 4.4 


Find the prime factorization of the numbers in Exercises 4.4.1 to 4.4.5.

1. 1,000,000      3. cae ae |      5. 1,048,576
2. 2044           4. 53,357


6. Show that the sum, difference, and product of any two elements of the set

$$\mathbb{Z}[\sqrt{-5}] = \{ a + b\sqrt{-5} \mid a \text{ and } b \text{ are real integers} \}$$

is also in that set.

7. For any element $z = a + b\sqrt{-5}$ of the set $\mathbb{Z}[\sqrt{-5}]$ above, define $N(z) = a^2 + 5b^2$.

(a) Prove that $N(z) = 1$ if and only if $z = \pm 1$.

(b) Prove that $N(z) \ne 3$ for all $z \in \mathbb{Z}[\sqrt{-5}]$.

(c) Find all the solutions of $N(z) = 9$ in $\mathbb{Z}[\sqrt{-5}]$.

(d) Prove that $N(zw) = N(z)N(w)$.

8. A nonzero element $p$ of the set $\mathbb{Z}[\sqrt{-5}]$ is said to be prime if it is not $\pm 1$, and if
whenever $p = zw$ for some $z, w \in \mathbb{Z}[\sqrt{-5}]$ we may conclude that either $z = \pm 1$
or $w = \pm 1$. Show that

(a) 3, $2 + \sqrt{-5}$, and $2 - \sqrt{-5}$ are all prime elements of $\mathbb{Z}[\sqrt{-5}]$, and

(b) 9 can be factored into primes of $\mathbb{Z}[\sqrt{-5}]$ in two different ways.

9. Prove that there is an infinite number of prime integers.

10. Prove that every element of $\mathbb{Z}[\sqrt{-5}]$ is expressible as the product of a finite number
of primes.

11. Prove that there is an infinite number of primes in $\mathbb{Z}[\sqrt{-5}]$.

12. A positive integer is said to be a perfect number if it equals the sum of its proper
divisors. For example, 6 and 28 are perfect because $6 = 1 + 2 + 3$ and
$28 = 1 + 2 + 4 + 7 + 14$. Prove that if $n$ is an integer such that $2^n - 1$ is a prime integer,
then $(2^n - 1)2^{n-1}$ is a perfect number. Use this to find at least two more perfect
numbers. (As of the writing of this text only 48 such perfect numbers have been
found, the largest having $n = 57{,}885{,}161$.)




13. A positive integer is said to be multiplicatively perfect if it equals the product of all
of its proper divisors. For example, 6 and 10 are multiplicatively perfect since
$6 = 1 \cdot 2 \cdot 3$ and $10 = 1 \cdot 2 \cdot 5$. Find a simple characterization of all the
multiplicatively perfect integers.
Find the number of distinct positive divisors of the numbers in Exercises 4.4.14 to 4.4.16. 
14. 3554 15. 12)? 
16. $p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}$, where $p_1, p_2, \ldots, p_k$ are distinct primes


17. Express the greatest common divisor and least common multiple of any two integers
in terms of their factorizations into prime powers, and then redo Exercise 4.2.31.


18. Find an integer $p$ such that $p$ is a prime in $\mathbb{Z}$ but $p + 0 \cdot \sqrt{-5}$ is not prime in
$\mathbb{Z}[\sqrt{-5}]$.


Chapter Summary 


The modular number systems $\mathbb{Z}_p$ ($p$ prime) share many of the properties of the rational,
real, and complex number systems. They are closed with respect to the four arithmetic
operations, and their elements may or may not possess square roots, cube roots, etc. In
particular, the question of the existence of multiplicative inverses in modular arithmetic is
resolved by an application of the well-known concept of the greatest common divisor of
integers. As long as this tool was under discussion, we went ahead and applied it to prove
the unique factorization of integers.
Chapter Review Exercises 


Mark the following statements true or false. 
x. 28=8? (mod 2°). 
2. 3 is a root of the congruence x? +2x + 1 =0 (mod 50). 
3. There exist integers x and y such that 25x + 137y =1. 
4-62 has a multiplicative inverse in Z,,. 
5- The multiplicative inverse of 63 in Z,o9 is 25. 
6. The prime factorization of 2,730 is $2 \cdot 3 \cdot 5 \cdot 91$.
Pe (ADS: 




New Terms 

Chinese Remainder Theorem, 69 Law of Quadratic Reciprocity, 70 
common divisor, 62 modular arithmetic, 57 
congruent modulo 2, 57 multiplicatively perfect, 73 
Euclidean algorithm, 63 perfect number, 72 

greatest common divisor, 62 relatively prime, 63 


highest common factor, 62 


Supplementary Exercises 


1. Write a computer script that will find the greatest common divisor of any two
integers.

2. Write a computer script that will find the multiplicative inverse of any element in
$\mathbb{Z}_p$ for any prime $p$.

3. Write a computer script that will solve any equation $f(x) = 0$ in $\mathbb{Z}_p$ for any prime
$p$.

4. Write a computer script that will solve any pair of simultaneous linear equations in
two unknowns in $\mathbb{Z}_p$ for any prime $p$.

5. Write a computer script that will factor any integer into primes.

6. Write a computer script that will list all the primes up to some given integer $n$.

7. Write a computer script that will list all the primes in $\mathbb{Z}[\sqrt{-5}]$.

8. If $n$ is any integer, let $\mathbb{Z}[\sqrt{n}] = \{ a + b\sqrt{n} \mid a, b \in \mathbb{Z} \}$. Investigate the question of
unique factorization in $\mathbb{Z}[\sqrt{n}]$ for various specific values of $n$.

9. Compute the number of digits in the largest known perfect number
$2^{57{,}885{,}160}(2^{57{,}885{,}161} - 1)$.


Chapter 5 


THE BINOMIAL THEOREM AND 
MODULAR POWERS 


The well-known Binomial Theorem is proved in this chapter. In addition to its
intrinsic interest, this theorem also leads to a simple proof of Fermat's Theorem
which, in turn, is useful for evaluating powers in arithmetic modulo a prime.


5.1 The Binomial Theorem 


The formula $(a+b)^2 = a^2 + 2ab + b^2$ is, of course, standard fare in high school algebra. We are concerned here with its generalization to higher exponents. The search for this generalization begins with the successive expansions of the third and fourth powers of the binomial $(a+b)$. These are easily obtained recursively as follows:


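$$
\begin{aligned}
(a+b)^3 &= (a+b)^2(a+b) = (a^2 + 2ab + b^2)(a+b) \\
&= a^3 + 2a^2b + ab^2 + a^2b + 2ab^2 + b^3 \\
&= a^3 + (2+1)a^2b + (1+2)ab^2 + b^3 \\
&= a^3 + 3a^2b + 3ab^2 + b^3
\end{aligned}
$$

and

$$
\begin{aligned}
(a+b)^4 &= (a+b)^3(a+b) = (a^3 + 3a^2b + 3ab^2 + b^3)(a+b) \\
&= a^4 + 3a^3b + 3a^2b^2 + ab^3 + a^3b + 3a^2b^2 + 3ab^3 + b^4 \\
&= a^4 + (3+1)a^3b + (3+3)a^2b^2 + (1+3)ab^3 + b^4 \\
&= a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4.
\end{aligned}
$$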


These low-order examples suggest that the expansion of $(a+b)^n$ consists of the sum of terms of the form $c_{n,k}a^{n-k}b^k$, where $n \ge k \ge 0$, and where the coefficient $c_{n,k}$ is a positive integer. Moreover, $c_{n,0} = c_{n,n} = 1$, and for each other $k$, $c_{n,k}$ is the sum of two coefficients of the expansion of $(a+b)^{n-1}$. More precisely, $c_{n,k} = c_{n-1,k} + c_{n-1,k-1}$. The above observations motivate the following inductive definition as well as the subsequent theorem. For any integers $n \ge k \ge 0$ the binomial coefficient $\binom{n}{k}$, which is the traditional way of writing $c_{n,k}$, is defined by


$$\binom{n}{0} = \binom{n}{n} = 1, \tag{5.1}$$

$$\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1} \quad \text{for } n > k > 0. \tag{5.2}$$


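Thus, by Equation 5.1,

$$\binom{0}{0} = \binom{1}{0} = \binom{1}{1} = \binom{2}{0} = \binom{2}{2} = 1,$$

whereas by Equation 5.2,

$$\binom{2}{1} = \binom{1}{1} + \binom{1}{0} = 1 + 1 = 2.$$

Similarly,

$$\binom{3}{1} = \binom{2}{1} + \binom{2}{0} = 2 + 1 = 3, \qquad \binom{3}{2} = \binom{2}{2} + \binom{2}{1} = 1 + 2 = 3,$$

$$\binom{4}{1} = \binom{3}{1} + \binom{3}{0} = 3 + 1 = 4, \qquad \binom{4}{2} = \binom{3}{2} + \binom{3}{1} = 3 + 3 = 6, \qquad \binom{4}{3} = \binom{3}{3} + \binom{3}{2} = 1 + 3 = 4.$$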


Since $(a+b)^0 = 1$ and $(a+b)^1 = (a+b)$, these numbers are now easily seen to agree with the coefficients of the expansions of $(a+b)^n$ for $n = 0, 1, 2, 3, 4$.
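The inductive definition can be carried out mechanically. The following Python sketch builds each row of binomial coefficients from the previous one, using nothing but Equations 5.1 and 5.2. The function name `binomial_rows` is our own choice for illustration and does not come from the text.

```python
def binomial_rows(n_max):
    """Return rows 0..n_max of binomial coefficients.

    Each row is built from the previous one with Equation 5.1
    (the two end entries equal 1) and Equation 5.2 (Pascal's Identity).
    """
    rows = [[1]]  # row 0: the single coefficient of (a + b)^0
    for n in range(1, n_max + 1):
        prev = rows[-1]
        row = [1]  # Equation 5.1: C(n, 0) = 1
        for k in range(1, n):
            # Equation 5.2: C(n, k) = C(n-1, k) + C(n-1, k-1)
            row.append(prev[k] + prev[k - 1])
        row.append(1)  # Equation 5.1: C(n, n) = 1
        rows.append(row)
    return rows


for n, row in enumerate(binomial_rows(4)):
    print(n, row)
```

The rows printed for $n = 3$ and $n = 4$ can be compared with the expansions of $(a+b)^3$ and $(a+b)^4$ obtained above.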




Theorem 5.3 (The Binomial Theorem) Let $n$ be a nonnegative integer. Then

$$(a+b)^n = \binom{n}{0}a^n + \binom{n}{1}a^{n-1}b + \binom{n}{2}a^{n-2}b^2 + \cdots + \binom{n}{k}a^{n-k}b^k + \cdots + \binom{n}{n}b^n.$$


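Proof. We proceed by induction on $n$, the cases $n = 0, 1, 2, 3, 4$ having been verified above. Assuming the theorem for $n = m$, we expand

$$
\begin{aligned}
(a+b)^{m+1} &= (a+b)^m(a+b) \\
&= \left[\binom{m}{0}a^m + \binom{m}{1}a^{m-1}b + \cdots + \binom{m}{k}a^{m-k}b^k + \cdots + \binom{m}{m}b^m\right](a+b) \\
&= \binom{m}{0}a^{m+1} + \binom{m}{1}a^{m}b + \binom{m}{2}a^{m-1}b^2 + \cdots + \binom{m}{k}a^{m+1-k}b^k + \cdots + \binom{m}{m}ab^m \\
&\quad + \binom{m}{0}a^{m}b + \binom{m}{1}a^{m-1}b^2 + \cdots + \binom{m}{k-1}a^{m+1-k}b^k + \cdots + \binom{m}{m}b^{m+1} \\
&= \binom{m+1}{0}a^{m+1} + \left[\binom{m}{1} + \binom{m}{0}\right]a^{m}b + \left[\binom{m}{2} + \binom{m}{1}\right]a^{m-1}b^2 + \cdots \\
&\quad + \left[\binom{m}{k} + \binom{m}{k-1}\right]a^{m+1-k}b^k + \cdots + \binom{m+1}{m+1}b^{m+1} \\
&= \binom{m+1}{0}a^{m+1} + \binom{m+1}{1}a^{m}b + \binom{m+1}{2}a^{m-1}b^2 + \cdots + \binom{m+1}{m+1}b^{m+1}.
\end{aligned}
$$

This completes the induction step and the proof. ∎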


Equation 5.2 is known as Pascal’s Identity. It was visualized by Pascal in the form of a triangular array (Pascal’s Triangle, Figure 5.1) wherein each number is the sum of the two entries directly above it and to its left (whenever these entries exist). The entries on the $(n+1)$-th diagonal line (from bottom left to top right) are the coefficients that appear in the expansion of $(a+b)^n$. These triangular arrays did not originate with Pascal. Figure 5.2 displays a reproduction of such an array that appeared over three centuries before his birth.




1   1   1   1   1   1   1   1   1   1   1
1   2   3   4   5   6   7   8   9  10
1   3   6  10  15  21  28  36  45
1   4  10  20  35  56  84 120
1   5  15  35  70 126 210
1   6  21  56 126 252
1   7  28  84 210
1   8  36 120
1   9  45
1  10
1

Figure 5.1 Pascal’s Triangle


Figure 5.2 Pascal’s Triangle as depicted in a Chinese book in 1303




It should be noted that the Binomial Theorem is valid for all the number systems we have studied so far. It is immaterial whether $a$ and $b$ are to be interpreted as real, complex, or modular numbers. Thus,

$$(2+i)^3 = 2^3 + 3\cdot 2^2\cdot i + 3\cdot 2\cdot i^2 + i^3 = 8 + 12i - 6 - i = 2 + 11i.$$

Similarly, in $\mathbb{Z}_3$,

$$(a+2)^3 = a^3 + 3\cdot a^2\cdot 2 + 3\cdot a\cdot 2^2 + 2^3 = a^3 + 0 + 0 + 2 = a^3 + 2.$$

In $\mathbb{Z}_4$,

$$(a+b)^4 = a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4 = a^4 + 2a^2b^2 + b^4.$$

For the purposes of the remainder of this section it may be assumed that the $a$ and the $b$ of the Binomial Theorem are real numbers.

While Pascal’s Identity provides an effective procedure for computing the binomial coefficients, it would clearly be handy to have a more direct method. To obtain such a formula, the $a$ of the Binomial Theorem is replaced by 1, and, since we are about to differentiate, the $b$ is replaced by $x$. The Binomial Theorem then assumes the form

$$(1+x)^n = 1 + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{k}x^k + \cdots + x^n. \tag{5.4}$$

When Equation 5.4 is differentiated $k$ times with respect to $x$, we obtain

$$n(n-1)(n-2)\cdots(n-k+1)(1+x)^{n-k} = k(k-1)\cdots 2\cdot 1\binom{n}{k} + d_1x + d_2x^2 + \cdots + d_{n-k}x^{n-k}, \tag{5.5}$$

where $d_1, d_2, \ldots, d_{n-k}$ are some integers whose exact values turn out to be immaterial. The substitution of $x = 0$ in Equation 5.5 yields

$$n(n-1)(n-2)\cdots(n-k+1) = k(k-1)\cdots 2\cdot 1\binom{n}{k}$$

or

$$\binom{n}{k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k(k-1)\cdots 2\cdot 1}. \tag{5.6}$$




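Accordingly, for $n = 6$,

$$\binom{6}{1} = \frac{6}{1} = 6, \qquad \binom{6}{2} = \frac{6\cdot 5}{2\cdot 1} = 15, \qquad \binom{6}{3} = \frac{6\cdot 5\cdot 4}{3\cdot 2\cdot 1} = 20,$$

$$\binom{6}{4} = \frac{6\cdot 5\cdot 4\cdot 3}{4\cdot 3\cdot 2\cdot 1} = 15, \qquad \binom{6}{5} = \frac{6\cdot 5\cdot 4\cdot 3\cdot 2}{5\cdot 4\cdot 3\cdot 2\cdot 1} = 6, \qquad \binom{6}{6} = \frac{6\cdot 5\cdot 4\cdot 3\cdot 2\cdot 1}{6\cdot 5\cdot 4\cdot 3\cdot 2\cdot 1} = 1,$$

so that

$$(a+b)^6 = a^6 + 6a^5b + 15a^4b^2 + 20a^3b^3 + 15a^2b^4 + 6ab^5 + b^6.$$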
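Equation 5.6 lends itself to a direct computation. The sketch below multiplies out the numerator and denominator of Equation 5.6 exactly as written; the name `binomial_product` is a hypothetical helper of ours, and the standard-library function `math.comb` (Python 3.8 or later) is used only as an independent cross-check.

```python
from math import comb


def binomial_product(n, k):
    """Evaluate the binomial coefficient by the product formula of Equation 5.6."""
    numerator = 1
    denominator = 1
    for j in range(k):
        numerator *= n - j      # n (n-1) ... (n-k+1)
        denominator *= j + 1    # 1 * 2 * ... * k
    # The quotient is an integer, so integer division loses nothing.
    return numerator // denominator


coefficients = [binomial_product(6, k) for k in range(7)]
print(coefficients)
print(coefficients == [comb(6, k) for k in range(7)])
```

For $k = 0$ both products are empty, so the function returns 1, in agreement with Equation 5.1.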


While the formula for $\binom{n}{k}$ given in Equation 5.6 is convenient for numerical computations, it is quite frequently better to use the equivalent expression

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} \tag{5.7}$$

where $k! = k(k-1)(k-2)\cdots 2\cdot 1$ for any positive integer $k$ and $0! = 1$ (Exercise 5.1.15).

The binomial coefficients $\binom{n}{k}$ are the subject of many surprising identities and we shall describe several methods for proving them. The most elementary approach uses Equation 5.6. Such is the case in the proof of the identity


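$$\binom{n}{k+1} = \frac{n-k}{k+1}\binom{n}{k}: \tag{5.8}$$

$$
\begin{aligned}
\binom{n}{k+1} &= \frac{n(n-1)\cdots[n-(k+1)+1]}{(k+1)k(k-1)\cdots 2\cdot 1} = \frac{n(n-1)\cdots(n-k)}{(k+1)k(k-1)\cdots 2\cdot 1} \\
&= \frac{n-k}{k+1}\cdot\frac{n(n-1)(n-2)\cdots(n-k+1)}{k(k-1)\cdots 2\cdot 1} = \frac{n-k}{k+1}\binom{n}{k}.
\end{aligned}
$$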


Equation 5.8 was put to very good use by Newton in his groundbreaking work on the extension of the Binomial Theorem to negative and fractional values of $n$.

As demonstrated by the proof of the Binomial Theorem, Equation 5.2 (Pascal’s Identity) 
is very useful as it lays the groundwork for many inductive proofs of other identities. 


However, many of these identities are subject to shorter proofs by another method, 




commonly called the method of generating functions. Consider, for example, the identity

$$2^n = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{k} + \cdots + \binom{n}{n} \tag{5.9}$$

for $n > 0$. While this identity can be proven directly, though somewhat laboriously, by mathematical induction, it can also be verified by substituting $a = b = 1$ in the Binomial Theorem. Thus, this method calls for the recognition of some functional identity which, upon the replacement of the variable(s) by some cleverly chosen values, yields the required identity. In fact, the above derivation of Equation 5.6 is also an instance of the method of generating functions, as is the following. If both sides of Equation 5.4 are differentiated with respect to $x$, we obtain

$$n(1+x)^{n-1} = \binom{n}{1} + 2\binom{n}{2}x + \cdots + k\binom{n}{k}x^{k-1} + \cdots + n\binom{n}{n}x^{n-1}.$$

The substitution of $x = 1$ now yields the identity

$$n2^{n-1} = \binom{n}{1} + 2\binom{n}{2} + \cdots + k\binom{n}{k} + \cdots + n\binom{n}{n}.$$
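Identities of this kind are easy to test numerically before attempting a proof. The following sketch checks Identity 5.9 and the identity just derived for a range of values of $n$. The function name `check_identities` is our own, and a numerical check of finitely many cases is of course no substitute for the proofs above.

```python
from math import comb


def check_identities(n):
    """Return a pair of booleans: whether Identity 5.9 and the
    differentiated identity n * 2^(n-1) = sum of k * C(n, k) hold for this n."""
    plain_sum = sum(comb(n, k) for k in range(n + 1))
    weighted_sum = sum(k * comb(n, k) for k in range(1, n + 1))
    return plain_sum == 2 ** n, weighted_sum == n * 2 ** (n - 1)


for n in range(1, 11):
    print(n, check_identities(n))
```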


Finally, there is another interpretation of the binomial coefficients that allows for a completely different approach to the whole topic. For this we need to reexamine the Binomial Theorem from another point of view. Each summand in the expansion of

$$(a_1 + a_2)(b_1 + b_2)(c_1 + c_2)(d_1 + d_2)\cdots$$

has the form $a_ib_jc_kd_l\cdots$ where each of the indices $i$, $j$, $k$, and $l$ assumes the values 1 or 2. When this observation is applied to the expression

$$(1+x)^n = (1+x)(1+x)(1+x)(1+x)\cdots$$

it is clear that each of the summands of this expansion has the form $X_1X_2\cdots X_n$, where each $X_i$ is either 1 or $x$. In particular, the general summand $X_1X_2\cdots X_n$ is of degree 2 in $x$ exactly when two of its factors $X_i$, $X_j$ equal $x$ and all the others equal 1.




The coefficient $\binom{n}{2}$ clearly equals the number of such summands that appear in the expansion of $(1+x)^n$. Since the number of such summands equals the number of pairs of symbols $X_i, X_j$ that can be designated for replacement by $x$, it follows that we now have obtained a formula for the number of ways a pair of distinct objects can be selected from a given set of $n$ distinct objects, namely, this number is $\binom{n}{2}$.

Similarly, $\binom{n}{3}$ equals the number of summands in the expansion of $(1+x)^n$ that have degree 3 in $x$, i.e., it equals the number of summands $X_1X_2\cdots X_n$ in which exactly three of the factors equal $x$. This number equals the number of distinct triples $X_i, X_j, X_k$ that can be selected from $X_1, X_2, \ldots, X_n$. In other words, $\binom{n}{3}$ equals the number of three-element sets that can be formed using the integers $1, 2, \ldots, n$. This generalizes to the following statement.


Proposition 5.10 For each pair of nonnegative integers $k \le n$, the binomial coefficient $\binom{n}{k}$ equals the number of $k$-element subsets that can be formed from the integers $1, 2, \ldots, n$.


For example, the number of three-person committees that can be selected from a group of 25 people is

$$\binom{25}{3} = \frac{25\cdot 24\cdot 23}{3\cdot 2\cdot 1} = 2{,}300.$$


This point of view provides us with a new tool for proving some facts about the binomial coefficients. Let us again consider Identity 5.9. Its right-hand side now has the obvious interpretation of denoting the number of all the subsets of $\{1, 2, \ldots, n\}$, classified by their cardinality. This number, however, also happens to be $2^n$, as can be seen by the following argument. Choosing a subset $S$ of $\{1, 2, \ldots, n\}$ is tantamount to deciding for each $k = 1, 2, \ldots, n$ whether or not $k$ belongs to $S$. Since $n$ such decisions are to be made, and since each such decision can be settled in one of two ways (to belong or not to belong), it follows that there are $2^n$ such subsets $S$.
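The counting interpretation can be checked by brute force for small $n$. The sketch below actually lists the $k$-element subsets of $\{1, 2, \ldots, n\}$ with `itertools.combinations` and counts them; the function name `subsets_by_size` is our own.

```python
from itertools import combinations
from math import comb


def subsets_by_size(n):
    """Count the k-element subsets of {1, ..., n} for k = 0, 1, ..., n
    by enumerating them one at a time."""
    elements = range(1, n + 1)
    return [sum(1 for _ in combinations(elements, k)) for k in range(n + 1)]


counts = subsets_by_size(5)
print(counts)                       # one count per subset size
print(sum(counts) == 2 ** 5)        # all subsets together, as in Identity 5.9
print(comb(25, 3))                  # the committee count of the example
```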
Exercises 5.1 


Expand the binomials in Exercises 5.1.1 to 5.1.6. 

i. (2x +3y?) 4. (3x?-yz3) in Z, 
2. (3x? -y23) 5. (z°+2)° in Z, 

3. (3x*-y2°) in Z, 6. (1—2/x) 


10. 


Il. 


12. 


13. 
14. 




Find the term containing 2”° in the expansion of (a — 467c*)30. 

Find the coefficient of x! in (x? +3/x)”. 

Find the coefficient of x!® in (2x4 —3x)?. 

Find the coefficients of x~4 and x~> in (x? —2/x7)!°, 

11. Prove that for $m = 4n, 4n-3, 4n-6, \ldots, -2n$, the coefficient of $x^m$ in $(x^2 + 1/x)^{2n}$ is

$$\frac{(2n)!}{\left(\dfrac{4n-m}{3}\right)!\,\left(\dfrac{2n+m}{3}\right)!}.$$

12. Prove that for $1 \le k \le n$, $\binom{n}{k-1} \ge \binom{n}{k}$ if $k \ge (n+1)/2$ and $\binom{n}{k-1} \le \binom{n}{k}$ if $k \le (n+1)/2$.

13. Show that the middle coefficient(s) of the expansion of $(1+x)^n$ is (are) the largest.

14. Which power of $x$ has the largest coefficient in the expansion of $(2+3x)^{17}$?


Prove the identities in Exercises 5.1.15 to 5.1.19 for any two integers $k$ and $n$ such that $0 < k < n$.


15. 
17. 
18. 
19. 


20. 


21. 


22. 


23. 


24. 


15. $\binom{n}{k} = \dfrac{n!}{k!\,(n-k)!}$     16. $\binom{n}{k} = \binom{n}{n-k}$

(ME=()CH) @2r2e) 

mr )= (r+ Ui) + rr) = (ri) + (21) 

(3)(F) = (73? (a) + 2073 (hn) + (3 (7) = (2) + 27222) + C2) 
20. Prove that the coefficient of the middle term of $(1+x)^{2n}$ equals the sum of the coefficients of the two middle terms of $(1+x)^{2n-1}$.

21. Prove that $2^n < \binom{2n}{n} < 4^n$ for $n > 1$.

22. Prove that the product of any $k$ consecutive positive integers is an integer multiple of $k!$.

23. Prove that the number of different 0-1 strings that may be formed with $p$ 0’s and $q$ 1’s in which no two 1’s are consecutive is $\binom{p+1}{q}$. For example, if $p = q = 3$, these strings are 010101, 100101, 101001, and 101010.

24. Prove that $\binom{2n}{n}$ is even for $n > 0$.


Prove the identities in Exercises 5.1.25 to 5.1.36. 


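25. $\binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \cdots + (-1)^n\binom{n}{n} = 0 \quad (n > 0)$

26. $\binom{n}{1} + \binom{n}{3} + \binom{n}{5} + \cdots = 2^{n-1} \quad (n > 0)$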



27. 
28. 
29. 
30. 
31. 
32. 
33- 
34- 
35- 
36. 




—2(3)+3(3)—--+(-1)"'n(2)=0 (n> 1) 

+3) FIG) tt (= 

aC eee er 

*+(t) (8) 4 #(2P =(2) 
5)+(4)—-(2) +--+ = 2”? cos(nz/4) 


ee ee ee ee eee eee eee 


aa) oz Or Oz oz i} oz ox es 


) 
) 
) 
) 
)- 
) 
) 
) 
) 


((3) + (FNC) + (2) (Cts) + (21 = (28) 


Simplify the expressions in Exercises 5.1.37 to 5.1.39. 


37- 
38. 
39. 
40. 


4i. 
42. 


43- 


44. 


45. 


(1) +22(8) +38) (8) 
(1)+2(2) +9%(3)4+ (8) 
(5)(8) + (19(8) + (298) 42-4 (ot (3) 


40. The Fibonacci numbers $F_n$ are defined inductively as $F_1 = F_2 = 1$ and $F_{n+2} = F_{n+1} + F_n$ for $n \ge 1$. Prove that

$$F_{n+1} = \binom{n}{0} + \binom{n-1}{1} + \binom{n-2}{2} + \cdots + \binom{n-m}{m}$$

for $n \ge 0$, where $m = n/2$ if $n$ is even and $m = (n-1)/2$ if $n$ is odd.

Prove that if m is an odd integer, then (”;!) = 1 (mod m). 

Prove that if p is a prime and m is an integer such that m #0 (mod p), then 
(ae #0 (mod p) for all positive integers &. 

43. Prove that if $k$ is any positive integer and $p$ is a prime, then $\binom{p^k-1}{j} \equiv (-1)^j \pmod{p}$ for $j = 0, 1, 2, \ldots, p^k - 1$.

44. The grid $G_{m,n}$ consists of the graphs of the lines $x = i$ and $y = j$ for $i = 0, 1, \ldots, m$ and $j = 0, 1, \ldots, n$ with $0 \le x \le m$ and $0 \le y \le n$ (see Figure 5.3). Prove that $G_{m,n}$ contains $\binom{m+1}{2}\binom{n+1}{2}$ rectangles.

45. A path from $(0, 0)$ to $(m, n)$ in the grid $G_{m,n}$ consists of a sequence of integer points $(x_0, y_0), (x_1, y_1), (x_2, y_2), \ldots, (x_{m+n}, y_{m+n})$ such that $(x_0, y_0) = (0, 0)$, $(x_{m+n}, y_{m+n}) = (m, n)$, and, for $k = 1, 2, \ldots, m+n$, either $(x_k, y_k) = (x_{k-1} + 1, y_{k-1})$ or $(x_k, y_k) = (x_{k-1}, y_{k-1} + 1)$. Prove that the number of distinct paths from $(0, 0)$ to $(m, n)$ in $G_{m,n}$ is $\binom{m+n}{m}$.


Figure 5.3 The grid $G_{m,n}$

46. How many triangles are contained in Figure 5.4(a)?

47. How many triangles are contained in Figure 5.4(b)?

(a) (b)

Figure 5.4 Triangle counting problems


48. Find an error in Figure 5.2. 


5.2 Fermat's Theorem and Modular Exponents 


We now return to modular arithmetic and address the issue of exponentiation. Consider the infinite sequence

$$2, 2^2, 2^3, 2^4, 2^5, 2^6, 2^7, 2^8, 2^9, \ldots \tag{5.11}$$

in $\mathbb{Z}_5$. Since $\mathbb{Z}_5$ contains only five elements, it is clear that the actual values of these powers must display many repetitions. In fact, when these powers are evaluated, the sequence is transformed into

$$2, 4, 3, 1, 2, 4, 3, 1, 2, \ldots \tag{5.12}$$

and so Sequence 5.11, and therefore also Sequence 5.12, will cycle indefinitely through the values 2, 4, 3, and 1. In general, if $a \in \mathbb{Z}_n$, the sequence

$$a, a^2, a^3, a^4, \ldots \tag{5.13}$$

is bound to eventually cycle since its individual terms run through a finite number of values, and once $a^k \equiv a^m \pmod{n}$, we must clearly have $a^{k+s} \equiv a^{m+s} \pmod{n}$ for $s = 0, 1, 2, \ldots$. What is not obvious is that when $n$ is a prime $p$, the cycling begins immediately and the length of this cycle is either $p - 1$ or some divisor thereof. We now set out to prove these interesting facts.
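The cycling is easy to observe experimentally. The following sketch lists the first few terms of Sequence 5.13 for a given $a$ and modulus $n$; the function name `power_sequence` is our own. The second call uses a composite modulus, chosen by us for illustration, for which the sequence cycles but never returns to its first term.

```python
def power_sequence(a, n, length):
    """Return the first `length` terms a, a^2, a^3, ... of Sequence 5.13 in Z_n."""
    terms = []
    value = 1
    for _ in range(length):
        value = (value * a) % n   # multiply by a once more and reduce modulo n
        terms.append(value)
    return terms


print(power_sequence(2, 5, 9))    # the terms of Sequence 5.12
print(power_sequence(2, 12, 6))   # composite modulus: 2 itself never recurs
```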

As was observed in the previous section, the Binomial Theorem also holds in $\mathbb{Z}_n$. Under certain circumstances, this modular Binomial Theorem assumes a very simple form. If $p$ is a prime number and $0 < k < p$, then the binomial coefficient $\binom{p}{k}$ is an integer that equals

$$\frac{p(p-1)\cdots(p-k+1)}{k(k-1)\cdots 1}.$$

Since $p$ is a prime greater than $k$, $p$ is relatively prime to the denominator of this fraction and so, as this fraction is known to cancel out to an integer, $p$ must be a prime factor of this integer. In other words, $\binom{p}{k} \equiv 0 \pmod{p}$ if $0 < k < p$. When this observation is applied to the expansion of $(a+b)^p$, we obtain the following curious fact.

Proposition 5.14 If $p$ is a prime integer and $a, b \in \mathbb{Z}_p$, then

$$(a+b)^p \equiv a^p + b^p \pmod{p}.$$
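Both the divisibility of the middle binomial coefficients and Proposition 5.14 can be tested by exhaustive search for small moduli. In the sketch below the two function names are hypothetical helpers of ours. The composite modulus 4 is included only to show that primality is genuinely needed.

```python
from math import comb


def middle_coefficients_vanish(p):
    """Check that C(p, k) is divisible by p for every 0 < k < p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))


def proposition_5_14_holds(p):
    """Check (a + b)^p = a^p + b^p in Z_p for every pair a, b of Z_p."""
    return all(
        pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
        for a in range(p)
        for b in range(p)
    )


for p in (2, 3, 5, 7, 11, 13):
    print(p, middle_coefficients_vanish(p), proposition_5_14_holds(p))
print(4, middle_coefficients_vanish(4), proposition_5_14_holds(4))
```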


This proposition, in turn, has as its consequence one of the fundamental elementary and nonobvious theorems of number theory. It was first pointed out by Fermat as a tool in the search for perfect numbers (Exercise 4.4.12), but has since completely transcended that narrow context.

Theorem 5.15 (Fermat’s Theorem) If $p$ is any prime integer and $a$ is any integer, then

$$a^p \equiv a \pmod{p}.$$

Proof. We proceed by induction on $a$, the theorem being clearly true for $a = 0$. Assuming that the theorem holds for $a = k$, we note that, by Proposition 5.14,

$$(k+1)^p \equiv k^p + 1^p \equiv k + 1 \pmod{p}.$$

Hence, by induction, the theorem holds for all the nonnegative values of $a$. Since every negative integer is congruent to some positive integer modulo $p$, the theorem holds for all values of $a$. ∎
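Fermat’s Theorem can be spot-checked with Python’s built-in three-argument `pow`, which computes a power modulo a given integer. The function name `fermat_holds` is our own. The composite moduli 6 and 561 are our choice of test cases: the first fails the congruence, while the second satisfies it for every $a$ even though it is not prime (see Exercise 5.2.12).

```python
def fermat_holds(p):
    """Check a^p = a (mod p) for a run of integers a that covers every
    residue class modulo p, negative values of a included."""
    return all(pow(a, p, p) == a % p for a in range(-p, 2 * p))


for p in (2, 3, 5, 7, 11, 13):
    print(p, fermat_holds(p))
print(6, fermat_holds(6))       # composite: the congruence fails for some a
print(561, fermat_holds(561))   # composite, yet the congruence holds for all a
```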


Fermat’s Theorem guarantees that Sequence 5.13 starts cycling immediately, a phenomenon that need not happen in $\mathbb{Z}_n$ when $n$ is composite (Exercise 5.2.27). It still remains to demonstrate that the length of the cycle is a divisor of $p - 1$ and this will be done shortly.

An interesting and typical consequence of this theorem is the fact that if $n$ is relatively prime to 7, then $n^6 - 1$ is divisible by 7. For, by Fermat’s Theorem, $n^7 \equiv n \pmod{7}$. Since $n$ is relatively prime to 7, it follows that $n \not\equiv 0 \pmod{7}$ and so division by $n$ yields $n^6 \equiv 1 \pmod{7}$, which is tantamount to saying that 7 divides $n^6 - 1$. It is clear that the same holds for every other prime.


Corollary 5.16 If $p$ is any prime and $a \not\equiv 0 \pmod{p}$, then $a^{p-1} \equiv 1 \pmod{p}$.


Corollary 5.16 can be interpreted as saying that every nonzero element of $\mathbb{Z}_p$ ($p$ prime) is a $(p-1)$-th root of unity modulo $p$. Such roots of unity can exist in $\mathbb{Z}_n$ for composite $n$ as well. Thus, $3^4 \equiv 5^4 \equiv 7^4 \equiv 1 \pmod{16}$. We shall henceforth understand the term root of unity to include both the complex ones and the appropriate elements of $\mathbb{Z}_n$ for all integers $n$. When the need arises, we shall refer to the latter as modular roots of unity.

For any modular root of unity $r$ in $\mathbb{Z}_n$ we define $o(r)$, the modular order of $r$ in $\mathbb{Z}_n$, to be the least positive integer $k$ such that $r^k \equiv 1 \pmod{n}$. Corollary 5.16 guarantees that every nonzero element of $\mathbb{Z}_p$, $p$ prime, has a finite order which is in fact no greater than $p - 1$. Table 5.1 lists the orders of the nonzero elements of $\mathbb{Z}_7$.


 m    | 1  2  3  4  5  6
 o(m) | 1  3  6  3  6  2

Table 5.1 The nonzero elements of $\mathbb{Z}_7$ and their orders
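The definition of modular order translates into a short search. The sketch below multiplies by $r$ repeatedly until the value 1 is reached; the function name `modular_order` is our own, and the guard against elements that are not relatively prime to $n$ (which are never roots of unity) is an addition of ours.

```python
from math import gcd


def modular_order(r, n):
    """Return the least positive k with r^k = 1 (mod n).

    Only elements relatively prime to n can be modular roots of unity,
    so any other input is rejected rather than looping forever.
    """
    if gcd(r, n) != 1:
        raise ValueError("r is not a root of unity in Z_n")
    value = r % n
    k = 1
    while value != 1:
        value = (value * r) % n
        k += 1
    return k


print({m: modular_order(m, 7) for m in range(1, 7)})   # compare with Table 5.1
```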


The fact that all the orders of Table 5.1 are divisors of $6 = 7 - 1$ is no coincidence. Modular order enjoys the same properties as does the order of the complex roots of unity, and we restate these properties in this new context without proof. It is easily verified that the proofs of Proposition 2.16 and Corollary 2.17 carry over to this new context without any modifications whatsoever.

Proposition 5.17 If $r$ is any modular root of unity in $\mathbb{Z}_n$ and $k$ is any integer, then $r^k \equiv 1 \pmod{n}$ if and only if $k$ is a multiple of $o(r)$.

Corollary 5.18 Suppose $r$ is a root of unity and $a$ and $b$ are any two integers. Then $r^a = r^b$ if and only if $o(r)$ is a divisor of $a - b$, and $1, r, r^2, \ldots, r^{o(r)-1}$ are all distinct.




Returning to Sequence 5.13 with $n = p$ a prime and $a \not\equiv 0 \pmod{p}$, Corollary 5.16 states that $a^{p-1} \equiv 1 \pmod{p}$, and so it follows from Proposition 5.17 that $o(a)$ is a divisor of $p - 1$. As the length of the repeating segment of Sequence 5.13 equals $o(a)$, we may conclude that this length is a divisor of $p - 1$.

An element of $\mathbb{Z}_p$ is said to be a primitive root modulo $p$ if its order in $\mathbb{Z}_p$ is $p - 1$. Thus, according to Table 5.1, 3 and 5 are the only primitive roots modulo 7. We shall later see (Theorem 7.17) that for every prime number $p$ there exist primitive roots modulo $p$. The solutions of Exercises 5.2.14 and 5.2.16 indicate that this fact is far from obvious.

We now go on to prove some more facts about orders of roots of unity. If $\zeta$ is any root of unity and $k$ is any integer, then it stands to reason that the order of $\zeta^k$ should depend on $k$ and the order of $\zeta$. Thus, if $o(\zeta) = 12$, then the orders of $\zeta^2, \zeta^3, \ldots, \zeta^{11}$ are easily seen to be 6, 4, 3, 12, 2, 12, 3, 4, 6, 12, respectively. A little experimentation leads to the statement, if not the proof, of the following proposition.

Proposition 5.19 If $\zeta$ is a root of unity of order $n$, then $o(\zeta^k) = n/(k, n)$ for all integers $k$.

Proof. Let $g$ denote the greatest common divisor of $k$ and $n$, and let $n'$ and $k'$ be integers such that $n = gn'$ and $k = gk'$. Since $g = (k, n)$, it follows that $k'$ and $n'$ are relatively prime. If $m$ is any integer, then the following statements are all equivalent:

• $(\zeta^k)^m = 1$;
• $n$ divides $km$;
• $gn'$ divides $gk'm$;
• $n'$ divides $k'm$;
• $n'$ divides $m$.

Hence $o(\zeta^k)$, the least positive integer $m$ for which $(\zeta^k)^m = 1$, is also the least positive integer that is divisible by $n'$, namely $n'$ itself. Thus $o(\zeta^k) = n' = n/g = n/(k, n)$. ∎
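Proposition 5.19 applies to modular roots of unity as well, which makes it easy to test. The sketch below takes the element 2 of $\mathbb{Z}_{13}$, which has order 12, and compares the order of each power $2^k$ with $12/(k, 12)$. The choice of 2 and 13 and the function names are ours, made only for illustration.

```python
from math import gcd


def modular_order(r, n):
    """Least positive k with r^k = 1 (mod n); r must be relatively prime to n."""
    if gcd(r, n) != 1:
        raise ValueError("r is not a root of unity in Z_n")
    value, k = r % n, 1
    while value != 1:
        value = (value * r) % n
        k += 1
    return k


def check_power_orders(zeta, modulus):
    """Compare o(zeta^k) with o(zeta) / gcd(k, o(zeta)) for a range of k."""
    n = modular_order(zeta, modulus)
    return all(
        modular_order(pow(zeta, k, modulus), modulus) == n // gcd(k, n)
        for k in range(1, 3 * n)
    )


print(modular_order(2, 13))
print([modular_order(pow(2, k, 13), 13) for k in range(2, 12)])
print(check_power_orders(2, 13))
```

The middle line of output can be compared with the list of orders of $\zeta^2, \zeta^3, \ldots, \zeta^{11}$ given before the proposition.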


The relevance of common divisors to the orders of roots of unity is reinforced by the 


next proposition for which we will eventually find several useful applications. 


Proposition 5.20 If $r$ and $s$ are roots of unity (both complex or both modular) and if the orders of $r$ and $s$ are relatively prime, then $o(rs) = o(r)o(s)$.

Proof. Let $R = o(r)$, $S = o(s)$, and $T = o(rs)$. Since $(rs)^{RS} = (r^R)^S(s^S)^R = 1$, it follows that $T$ is a divisor of $RS$. Conversely, $1 = (rs)^{TS} = r^{TS}s^{TS} = r^{TS}\cdot 1 = r^{TS}$. It therefore follows from Proposition 5.17 that $R$ is a divisor of $TS$. Since $R$ and $S$ are relatively prime, it follows from Lemma 4.6 that $R$ is a divisor of $T$. A similar argument permits us to conclude that $S$ is also a divisor of $T$. Since $R$ and $S$ are relatively prime, it now follows from Corollary 4.7 that $RS$ is a divisor of $T$. Thus, we have shown that $T$ and $RS$ each divide the other and the proposition follows. ∎




Thus, if $\zeta = \cos 2\pi/12 + i\sin 2\pi/12$, then the elements $\zeta^8$ and $\zeta^9$ of $\sqrt[12]{1}$ have orders 3 and 4, respectively, and the element $\zeta^5 = \zeta^8\zeta^9$ has order $12 = 3\cdot 4$. Similarly, in $\mathbb{Z}_{19}$, $o(18) = 2$, $o(7) = 3$, and $o(18\cdot 7) = o(12) = 6$. Exercise 5.2.22 implies that the requirement of relative primeness in the above proposition is indeed necessary.


Exercises 5.2 


I. 


Re 


Evaluate x °° for each element x of Z,. 


Evaluate x?° for each element x of Zi: 


Solve the equations in Exercises 5.2.3 to 5.2.5 in Z,. 


3. 


ym WA 


Il. 


12. 


13. 
14. 
15. 
16. 
17. 
18. 


19. 


7,777 7,777 
x 7 x 6 


+x+5=0 4. +x+5=0 60 te 4 5 =0 


Solve the equation in Exercise 5.2.3 in Z,, . 
Solve the equation in Exercise 5.2.4 in Z,,. 
Solve the equation in Exercise 5.2.5 in Z,,. 


9. Prove that $n^7 - n$ is divisible by 42 for any integer $n$.

10. Prove that $n^{13} - n$ is divisible by 2,730 for every integer $n$.

11. Prove that $n^5 - n$ is divisible by 30 for all integers $n$ and by 240 for all odd $n$.

12. Prove that $n^{561} \equiv n \pmod{561}$. Note that this example disproves the converse of Fermat’s Theorem. Composite numbers with this property are called Carmichael numbers; 561 is the smallest Carmichael number, and it was only recently proved that there are infinitely many such numbers.

13. For any prime $p$, if $a^p \equiv b^p \pmod{p}$, show that $a^p \equiv b^p \pmod{p^2}$.
Find the units digit of 24°. 

15. Find all the primitive roots modulo 11.

16. Find all the primitive roots modulo 19.

17. For each prime $p < 20$ find a number that is a primitive root mod $p$.

18. Let $p$ be a fixed prime and let $o(a)$ denote the order of $a$ in $\mathbb{Z}_p$. Prove that if $a \ne 1$, then $1 + a + a^2 + \cdots + a^{o(a)-1} \equiv 0 \pmod{p}$.

19. Let $p$ be a fixed prime and let $o(a)$ denote the order of $a$ in $\mathbb{Z}_p$. Prove that

$$a\cdot a^2\cdots a^{o(a)-1} \equiv (-1)^{o(a)-1} \pmod{p}.$$




20. Let $p$ be any prime number. Prove that $(p-1)! \equiv -1 \pmod{p}$.

21. Let $p$ be any prime number. Prove that if there are primitive roots mod $p$, then the product of all of them is equivalent to 1 or 2 mod $p$.

22. Prove that for any odd prime $p$ there exist nonzero elements $a$ and $b$ of $\mathbb{Z}_p$ such that $o(ab) \ne o(a)o(b)$.

23. The Fibonacci number $F_n$ is defined recursively as $F_1 = F_2 = 1$ and $F_n = F_{n-1} + F_{n-2}$ for $n > 2$.

(a) Prove that

$$F_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n\right]$$

for all $n = 1, 2, \ldots$ (this is known as Binet’s Formula).

(b) Prove that if $p$ is a prime distinct from 5, then $F_p \equiv \pm 1 \pmod{p}$.

24. Show that the equation $x^n + y^n \equiv z^n \pmod{3}$ has nonzero solutions in $\mathbb{Z}_3$ if and only if $n$ is odd.

25. Show that the equation $x^n + y^n \equiv z^n \pmod{5}$ has nonzero solutions in $\mathbb{Z}_5$ if and only if $n$ is odd.

26. For which positive integers $n$ does the equation $x^n + y^n \equiv z^n \pmod{7}$ have nonzero solutions in $\mathbb{Z}_7$?

27. Find integers $a$ and $n$ such that $a \in \mathbb{Z}_n$ but the Sequence 5.13 does not cycle immediately.


5.3. The Multinomial Theorem 


Having obtained the Binomial Theorem, which describes the expansion of $(a+b)^n$, it is natural to ask for analogous expressions for $(a+b+c)^n$, $(a+b+c+d)^n$, etc. It turns out that these expressions are harder to describe than to derive. We begin by changing the variables to $x_1, x_2, \ldots$, and observe that for any $v$ such variables and for any positive integer $n$, the expansion of $(x_1 + x_2 + \cdots + x_v)^n$ consists of the sum of terms each of which has the form

$$c\,x_1^{k_1}x_2^{k_2}\cdots x_v^{k_v}$$

where $k_1 + k_2 + \cdots + k_v = n$ and $c$ is a positive integer that depends on $n, k_1, k_2, \ldots, k_v$.




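Theorem 5.21 (The Multinomial Theorem) If $n$ and $v$ are any positive integers, then

$$(x_1 + x_2 + \cdots + x_v)^n = \sum_K \frac{n!}{k_1!\,k_2!\cdots k_v!}\,x_1^{k_1}x_2^{k_2}\cdots x_v^{k_v}$$

where $K$ varies over all the $v$-tuples $K = (k_1, k_2, \ldots, k_v)$ of nonnegative integers $k_1, k_2, \ldots, k_v$ such that $k_1 + k_2 + \cdots + k_v = n$.

Proof. Let $n$ and $v$ be fixed positive integers. As noted above, for each $v$-tuple $K = (k_1, k_2, \ldots, k_v)$ of the above format there exists an integer $c_K$ such that

$$(x_1 + x_2 + \cdots + x_v)^n = \sum_K c_K\,x_1^{k_1}x_2^{k_2}\cdots x_v^{k_v}. \tag{5.22}$$

Fix some such $K = (k_1, k_2, \ldots, k_v)$ and differentiate both of the sides of Equation 5.22 $k_i$ times with respect to $x_i$ for each $i = 1, 2, \ldots, v$. Since $k_1 + k_2 + \cdots + k_v = n$,

$$\frac{\partial^n}{\partial x_1^{k_1}\,\partial x_2^{k_2}\cdots\partial x_v^{k_v}}(x_1 + x_2 + \cdots + x_v)^n = n!.$$

Moreover, for any $v$-tuple $(m_1, m_2, \ldots, m_v)$ that is different from $(k_1, k_2, \ldots, k_v)$ and for which $m_1 + m_2 + \cdots + m_v = n$, we must have $m_i < k_i$ for some $i$ so that

$$\frac{\partial^n}{\partial x_1^{k_1}\,\partial x_2^{k_2}\cdots\partial x_v^{k_v}}\,x_1^{m_1}x_2^{m_2}\cdots x_v^{m_v} = \begin{cases} k_1!\,k_2!\cdots k_v! & \text{if } m_i = k_i \text{ for all } i, \\ 0 & \text{otherwise.} \end{cases}$$

Hence $n! = c_K\,k_1!\,k_2!\cdots k_v!$ and

$$c_K = \frac{n!}{k_1!\,k_2!\cdots k_v!}.$$

∎

Accordingly, the coefficient of $x^4y^3z$ in the expansion of $(x+y+z)^8$ is $8!/(4!\,3!\,1!) = 280$.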
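The coefficient formula of the Multinomial Theorem is straightforward to evaluate. In the sketch below the function name `multinomial` is our own; it takes the exponents $k_1, k_2, \ldots, k_v$ and uses their sum as $n$.

```python
from math import factorial


def multinomial(*ks):
    """Return n! / (k_1! k_2! ... k_v!) where n = k_1 + k_2 + ... + k_v."""
    result = factorial(sum(ks))
    for k in ks:
        # Each division is exact because the running quotient is itself
        # a product of an integer coefficient and a factorial.
        result //= factorial(k)
    return result


print(multinomial(4, 3, 1))   # coefficient of x^4 y^3 z in (x + y + z)^8
print(multinomial(2, 2))      # with two variables this is a binomial coefficient
```

With $v = 2$ the function reduces to Equation 5.7, which offers a quick consistency check against the Binomial Theorem.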


Exercises 5.3 


Expand $(x+y+z)^n$ for the values of $n$ specified in Exercises 5.3.1 to 5.3.3.

1. 2     2. 3     3. 4


Expand the multinomials in Exercises 5.3.4 to 5.3.6. 
4. (x? +9? +2) 5. (x*+yptxy/) 6. (xy+yztexy 
7- Determine the coefficient of x4 y?z> in the expansion of (x + y+2z)!* . 
8. Find the coefficient of x? in the expansion of (1 + 2x —3x’) . 


9. Find the coefficient of x‘ in the expansion of (1—3x +2x*)° . 


10. Find the coefficient of x? in the expansion of (1 —2x + 3x? — 4x3)’. 


Suppose $(1 + x + x^2 + \cdots + x^k)^n = a_0 + a_1x + a_2x^2 + \cdots + a_{kn}x^{kn}$. Prove Exercises 5.3.11 to 5.3.14.

11. Show that $a_0 + a_1 + a_2 + \cdots + a_{kn} = (k+1)^n$.

12. Evaluate $a_1 + 2a_2 + 3a_3 + \cdots + kn\,a_{kn}$.

13. Show that when $k = 2$,

$$a_0^2 - a_1^2 + a_2^2 - a_3^2 + \cdots + (-1)^{n-1}a_{n-1}^2 = a_n[1 - (-1)^na_n]/2.$$

14. Show that when $k = 2$,

$$a_0 + a_3 + a_6 + \cdots = a_1 + a_4 + a_7 + \cdots = a_2 + a_5 + a_8 + \cdots = 3^{n-1}.$$


15. Show that when & = 3, 


-5()(22) 


16. Prove that the number of terms in the expansion of $(x_1 + x_2 + \cdots + x_v)^n$ is $\binom{n+v-1}{v-1}$.




5.4 The Euler φ-Function


Corollary 4.4 guarantees that whenever $m$ is relatively prime to $n$, it has a multiplicative inverse in $\mathbb{Z}_n$ and according to Exercise 4.2.26 this is actually the complete answer. Namely, if $m$ is not relatively prime to $n$, then it does not possess a multiplicative inverse in $\mathbb{Z}_n$. Thus 1 and 5 are the only elements of $\mathbb{Z}_6$ that have multiplicative inverses whereas 1, 3, 5, and 7 are the only elements of $\mathbb{Z}_8$ that have multiplicative inverses. This raises the interesting question of just how many elements of $\mathbb{Z}_n$ do in general have such inverses. As the resulting formula has some bearing on the complex roots of unity and on other subsequent issues, it will be derived here. For any positive integer $n$ let $\varphi(n)$ denote the number of positive integers not greater than $n$ that are relatively prime to $n$. This is known as the Euler φ-function. As noted above, $\varphi(6) = 2$ and $\varphi(8) = 4$. It is clear that if $p$ is a prime, then $\varphi(p) = p - 1$, since every positive number less than $p$ is relatively prime to $p$. In fact, if $p$ is a prime and $m$ is a positive integer, then $\varphi(p^m) = p^m - p^{m-1}$ since the only numbers between 1 and $p^m$ that are not relatively prime to $p^m$ are $p, 2p, 3p, 4p, \ldots, (p^{m-1})p$, and there are clearly exactly $p^{m-1}$ of those. As every number $n$ can be factored into the form $p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k}$ where $p_1, p_2, \ldots, p_k$ are distinct primes, it is now clear that the following lemma will eventually provide the complete answer.

Lemma 5.23 If $m$ and $n$ are relatively prime positive integers, then $\varphi(mn) = \varphi(m)\varphi(n)$.


Proof. Let $\zeta = \cos 2\pi/m + i\sin 2\pi/m$ and $\eta = \cos 2\pi/n + i\sin 2\pi/n$. It follows from Proposition 5.19 that $(k, m) = 1$ if and only if $\zeta^k$ is a primitive $m$-th root of unity. Hence we can interpret $\varphi(m)$ as the number of primitive $m$-th roots of unity. The lemma will be proved by demonstrating that all of the primitive $mn$-th roots of unity are obtained, without repetition, when an arbitrary primitive $m$-th root of unity is multiplied by an arbitrary primitive $n$-th root of unity.

We first dispose of the possible redundancies in this process. Thus, suppose that $\zeta^a$ and $\zeta^x$ are two primitive $m$-th roots of unity with $1 \le a, x \le m$, that $\eta^b$ and $\eta^y$ are two primitive $n$-th roots of unity with $1 \le b, y \le n$, and that $\zeta^a\eta^b = \zeta^x\eta^y$. Then $\zeta^{a-x} = \eta^{y-b}$ and consequently

$$\zeta^{(a-x)n} = (\zeta^{a-x})^n = (\eta^{y-b})^n = (\eta^n)^{y-b} = 1^{y-b} = 1.$$

Since $\zeta$ is a primitive $m$-th root of unity it follows that $m$ is a divisor of $(a-x)n$. However, $m$ and $n$ are relatively prime, and so we may conclude that $m$ is a divisor of $(a-x)$ alone. Since both $a$ and $x$ lie between 1 and $m$, so that $|a - x| < m$, it follows that $a = x$. A similar argument allows us to conclude that $b = y$. Thus it has been demonstrated that the set of all products of primitive $m$-th roots by primitive $n$-th roots contains exactly $\varphi(m)\varphi(n)$ distinct elements.

It follows from Proposition 5.20 that each of the products $\zeta^a\eta^b$, with $\zeta$, $\eta$, $a$, and $b$ as above, has order $mn$ and is therefore a primitive $mn$-th root of unity. It therefore only remains to show that every primitive $mn$-th root is accounted for by this process.

Let $\alpha$ be any primitive $mn$-th root of unity, and let $A$ and $B$ be two integers such that

$$Am + Bn = 1. \tag{5.24}$$

Then clearly $\alpha = \alpha^{Am}\alpha^{Bn}$. Now $(\alpha^{Am})^n = (\alpha^{mn})^A = (1)^A = 1$, and so $\alpha^{Am}$ is an $n$-th root of unity. We know from Equation 5.24 that $(A, n) = 1$. It therefore follows from Proposition 5.19 that $\alpha^{Am}$ is in fact primitive, since

$$o(\alpha^{Am}) = \frac{mn}{(Am, mn)} = \frac{mn}{m(A, n)} = n.$$

A similar argument establishes that $\alpha^{Bn}$ is a primitive $m$-th root of unity, and so products of the primitive $m$-th roots of unity with the primitive $n$-th roots do indeed cover all the primitive $mn$-th roots, each exactly once. Thus, $\varphi(m)\varphi(n) = \varphi(mn)$. ∎


We are now ready to derive an explicit formula for the number of positive integers that are both less than $n$ and relatively prime to it. By Proposition 5.19, this is also equal to the number of primitive $n$-th roots of unity.

Theorem 5.25 If $n$ is any number with prime factorization $p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k}$, then

$$\varphi(n) = (p_1^{e_1} - p_1^{e_1-1})(p_2^{e_2} - p_2^{e_2-1})\cdots(p_k^{e_k} - p_k^{e_k-1}).$$

Proof. By Lemma 5.23, it suffices to prove this theorem for the case where $n$ is the power of a single prime, i.e., where there exist a prime number $p$ and an integer $m$ such that $n = p^m$. However, as was noted just prior to Lemma 5.23, $\varphi(p^m) = p^m - p^{m-1}$, and so the theorem now follows immediately. ∎


For example, since $100 = 2^2 5^2$, it follows that $\varphi(100) = (2^2 - 2)(5^2 - 5) = 40$.
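The formula of Theorem 5.25 can be compared with a direct count taken from the definition of $\varphi$. In the sketch below both function names are our own, and the factorization is found by simple trial division, which is adequate only for small $n$.

```python
from math import gcd


def phi_by_counting(n):
    """Count the integers 1 <= m <= n that are relatively prime to n."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)


def phi_by_formula(n):
    """Evaluate the product of (p^e - p^(e-1)) over the prime powers p^e
    in the factorization of n, found here by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            power = 1
            while n % p == 0:      # strip off the full power of p
                n //= p
                power *= p
            result *= power - power // p
        p += 1
    if n > 1:                      # a single prime factor may remain
        result *= n - 1
    return result


print(phi_by_formula(100), phi_by_counting(100))
print(all(phi_by_formula(n) == phi_by_counting(n) for n in range(1, 301)))
```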




Private Key:  $n = pq$,  $\varphi(n) = (p-1)(q-1)$,
              $e \in \mathbb{Z}_{\varphi(n)}$,  $d \equiv e^{-1} \pmod{\varphi(n)}$;
Public Key:   $e$ and $n$;
Encryption:   $c \equiv m^e \pmod{n}$;
Decryption:   $c^d \equiv (m^e)^d \equiv m \pmod{n}$.

Table 5.2 RSA encryption


Number theory in general and Euler’s $\varphi$-function in particular were considered to be
pure, as opposed to applied, mathematics. In the 1970s, however, mathematicians found
a very useful application of elementary number theory to encryption and decryption.

The purpose of encryption is to secure communications and we propose to do this
by means of a method that relies on the mathematics of the previous section. Suppose
Xenon has a multitude of customers and he wishes to establish secure communications
with each of them. He consults Fermat, who sells him his two latest primes $p$
and $q$, which Xenon hides as a tattoo on his body. Xenon also computes and displays on
his website, as part of the Public Key, both the product $n = pq$ and a positive integer $e$
such that $1 < e < \varphi(n)$ and $\gcd(e, \varphi(n)) = 1$.

When a client wishes to communicate something to Xenon, she translates
the message into a number $m$ with $1 < m < n$, which she promptly raises to the $e$-th
power modulo $n$ and sends to Xenon. If the message $m^e$ reaches Xenon unmolested, all he
has to do to decrypt the message is to raise $m^e$ to the $d$-th power, where
$d = e^{-1} \pmod{\varphi(n)}$ (see Table 5.2).

If, however, the Red Baron intercepts this message, he still does not know the values of
$p$ and $q$, and since he, the Red Baron, does not know $\varphi(n)$, he cannot find $d$
notwithstanding the fact that he does know $e$.
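The scheme of Table 5.2 can be tried out on a toy scale. The sketch below is illustrative only: the primes 61 and 53, the exponent 17, and the names `encrypt` and `decrypt` are assumptions chosen for the example, and keys this small offer no security.

```python
from math import gcd

# Toy parameters, assumed for illustration; real keys use very large primes.
p, q = 61, 53
n = p * q                    # public modulus, 3233
phi_n = (p - 1) * (q - 1)    # 3120, known only to the key owner
e = 17                       # public exponent
assert gcd(e, phi_n) == 1    # e must be invertible modulo phi(n)
d = pow(e, -1, phi_n)        # private exponent (requires Python 3.8+)

def encrypt(m):
    """c = m^e (mod n), computable by anyone who knows e and n."""
    return pow(m, e, n)

def decrypt(c):
    """m = c^d (mod n), computable only with the private exponent d."""
    return pow(c, d, n)

message = 65
cipher = encrypt(message)
print(cipher, decrypt(cipher))
```

The built-in three-argument `pow` performs modular exponentiation, and with exponent `-1` it returns the modular inverse, which is how $d$ is obtained here.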


Exercises 5.4 


1. Compute $\varphi(24)$, $\varphi(144)$, and $\varphi(1{,}000)$.
2. Prove that $\varphi(n)$ is even for $n > 2$.
3. For what values of $n$ is $\varphi(n)$ a prime number? Justify your answer.
4. For what values of $n$ is $\varphi(n)$ the power of a single prime number? Justify your
answer.




5. True or false: there are an infinite number of integers $n$ such that $\varphi(n) < 100$.
Justify your answer.
6. Prove that if $n$ is any positive integer, then

\[
\sum_{d \mid n} \varphi(d) = n.
\]

7. Let $m$ and $n > 1$ be positive integers such that $\varphi(mn) = \varphi(m)$. Prove that $n = 2$
and $m$ is odd.
8. Prove that if $g = (m, n)$, then $\varphi(mn) = g\,\varphi(m)\varphi(n)/\varphi(g)$.
9. Prove that if $d \mid n$, then $\varphi(d) \mid \varphi(n)$.


Chapter Summary 


Having proved the Binomial Theorem, we used it to derive Fermat’s Theorem for
exponents in arithmetic modulo $p$, which effectively states that the nonzero elements of
$\mathbb{Z}_p$ are all $(p - 1)$-th roots of unity. This allowed us to extend the notion of order to
modular arithmetic, and we derived some new theorems regarding the orders of roots of
unity which apply to both the complex and the modular ones.
Chapter Review Exercises 


Mark the following true or false. 


t.  (3)=n(n—1)/2. a €2 oe 
3. The number of pairs that can be formed by selecting two elements from the set 


{a, b,c, } is 8. 
4. 127=1 (mod 17). 
5. $(3 + 8)^{11} = 3^{11} + 8^{11} \pmod{11}$.


6. 13°°=1 (mod 31). 
7. If o(f)= 144, then o(f!7°) = 102. 
8 (xtytzP ax? typ tert+Sxty + Sxyt + 5xtc + 5xzt + Sytz + Syzit 


10x? y? + 10x3y? + 10x72 + 10x3z* + 10y?z9 + 10y32?. 
9. $\varphi(n) < n$ for all integers $n > 2$.
10. $\varphi(15) = 8$.




New Terms 

binomial coefficient
Euler $\varphi$-function
Fibonacci numbers
method of generating functions
modular order
modular roots of unity
Pascal’s Identity
Pascal’s Triangle
primitive root


Supplementary Exercises 
1. Write a computer script which computes the order of any element modulo p. 
2. Write a computer script that evaluates $\binom{m}{n}$ for any two positive integers $m > n$.
3- Find lim, o(7). 
4. Let $F_n$ be the Fibonacci number of Exercise 5.2.22. Prove that $(F_m, F_n) = F_{(m, n)}$
and investigate the question of which Fibonacci numbers are prime.
5. For which positive integers $n$ does $\mathbb{Z}_n$ have an element of order $\varphi(n)$?
6. For each positive integer $n$ and $a \in \mathbb{Z}_n$, investigate the length of the (eventually)
repeating segment of $a, a^2, a^3, a^4, \ldots$.
7. Find some more Carmichael numbers.


Chapter 6


POLYNOMIALS OVER A FIELD


THE FOCUS now shifts to the topic of polynomials. Since polynomials have numerical
coefficients, and since we have by now encountered a great variety of disparate
number systems, the polynomials are studied in the more general setting of abstract fields.
We are mainly concerned here with the factorization of polynomials in one variable, but
some attention is also given to the symmetric polynomials in several variables and their
utility in solving the general quartic equation.


6.1 Fields and Their Polynomials 


We have by now encountered a host of algebraic structures within which the four arith- 
metical operations hold sway. These are the real numbers, the rational numbers, the 
complex numbers, and arithmetic modulo p where p is a prime. Another collection 
of such structures will be studied in detail in Chapter 7, and mathematicians have con- 
structed many others that will not be mentioned here. It stands to reason that algebraic 
structures with such strong similarities will share yet other properties, and these are this 
chapter's concern. The notion of a field is used to extract the properties that are common 
to all these similar structures. 

A field is a set $F$ with two binary operations, usually denoted by $+$ and $\cdot$, for which
the following hold. For any elements $a$, $b$, and $c$ of $F$,

$a + b \in F$ and $a \cdot b \in F$ (closure),
$(a + b) + c = a + (b + c)$ and $(a \cdot b) \cdot c = a \cdot (b \cdot c)$ (associativity),
$a + b = b + a$ and $a \cdot b = b \cdot a$ (commutativity),
$a \cdot (b + c) = a \cdot b + a \cdot c$ (distributivity),
there exist distinct elements $0, 1 \in F$ such that
$a + 0 = a$ and $a \cdot 1 = a$ (identities),
there exists an element $-a \in F$ such that
$a + (-a) = 0$ (additive inverse),
and if $a \ne 0$, then there exists an element $a^{-1} \in F$ such that
$a \cdot a^{-1} = 1$ (multiplicative inverse).


It will prove useful to label the most familiar fields with some symbols. Accordingly,
$\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ will denote the rational, real, and complex number systems, respectively.
The fact that all these number systems are indeed fields will not be belabored here. It
should be noted, however, that not all algebraic structures are necessarily fields. If $n$ is a
composite number, then $\mathbb{Z}_n$ is not a field since, as was noted in Chapter 4, no divisor of $n$
except 1 has a multiplicative inverse in $\mathbb{Z}_n$. The set of polynomials with real coefficients is
another example of an algebraic structure that is not a field. It is easy to convince oneself 
that the normal addition and multiplication of such polynomials have all the properties 
required of a field, except for the last one. The multiplicative inverses of polynomials are 
not polynomials. For example, there is no polynomial whose product with x + 1 is 1 
(Exercise 6.1.4). 
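The remark about composite moduli can be tested by brute force. The following Python sketch is an illustration under the assumption that elements of $\mathbb{Z}_n$ are represented by the integers $0, 1, \ldots, n - 1$; the names `units` and `is_field` are hypothetical.

```python
def units(n):
    """Elements of Z_n that have a multiplicative inverse, found by search."""
    return [a for a in range(1, n)
            if any(a * b % n == 1 for b in range(1, n))]

def is_field(n):
    """Z_n is a field exactly when every nonzero element is invertible."""
    return n > 1 and len(units(n)) == n - 1

print(units(6))                           # [1, 5]
print([n for n in range(2, 20) if is_field(n)])
```

Only the multiplicative-inverse axiom is tested, since the other field axioms hold in every $\mathbb{Z}_n$.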

Since the above-listed properties are shared by all fields, it is clear that any proposition 
whose justification relies only on these common properties is necessarily valid in all fields. 
In particular, it will hold for $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, for all $\mathbb{Z}_p$ with $p$ prime, and for the new fields to
be introduced in the next chapter. The following is an example of such a proposition.


Proposition 6.1 If $a$ and $b$ are two elements of the field $F$, then $a \cdot b = 0$ if and only
if either $a$ or $b$ is zero.


Proof. Set $x = a \cdot 0$. By the distributivity of addition and multiplication,

\[
x = a \cdot 0 = a \cdot (0 + 0) = a \cdot 0 + a \cdot 0 = x + x.
\]

Consequently,

\[
a \cdot 0 = x = x + (x + (-x)) = (x + x) + (-x) = x + (-x) = 0.
\]

Conversely, suppose $a \cdot b = 0$ but $a \ne 0$. Then $a$ has a multiplicative inverse $a^{-1}$ and so

\[
0 = a^{-1} \cdot 0 = a^{-1} \cdot (a \cdot b) = (a^{-1} \cdot a) \cdot b = 1 \cdot b = b.
\]

Hence, either $a$ or $b$ is zero. ∎


It is important to note that Proposition 6.1 does not hold in all algebraic structures.
Thus, in $\mathbb{Z}_6$, $2 \cdot 3 = 0$ even though neither 2 nor 3 is zero. Another, more substantial
example of a proposition that does hold for all fields is the Binomial Theorem. It was
already stated in Section 5.1 that the proof of this theorem holds regardless of whether the
numbers in question are complex or modular. In fact, the proof of Theorem 5.3 carries
over verbatim to arbitrary fields once the meaning of the terms is clarified. For any positive
integer $m$ and any element $a$ of some field $F$, let $a^m$ denote the product of $m$ $a$’s, and
let $ma$ denote the sum of $m$ $a$’s. Accordingly, $a^3 = a \cdot a \cdot a$ and $3a = a + a + a$. If $m$ is a
negative integer we define $a^m = (a^{-1})^{-m}$ and $ma = -(-m)a$. Finally, we set $a^0 = 1$ and
$0a = 0$. It is easily seen that such identities as $a^m \cdot a^n = a^{m+n}$ and $ma + na = (m + n)a$
hold in this generalized context just as they do for the complex and modular fields
(Exercises 6.1.24 and 6.1.25).


Theorem 6.2 (The Binomial Theorem) Let $F$ be a field, $a, b \in F$, and let $n$ be a
nonnegative integer. Then

\[
(a + b)^n = \binom{n}{0} a^n + \binom{n}{1} a^{n-1} b + \cdots + \binom{n}{k} a^{n-k} b^k + \cdots + \binom{n}{n} b^n.
\]


We now go on to show that many of the properties of polynomials with real coefficients 
are also valid when the coefficients are allowed to be the members of an arbitrary field. 
This will be accomplished by proving that the validity of these properties follows from the 
defining properties of fields listed above alone. For the sake of simplifying the notation 
we adopt here the usual convention that the product $a \cdot b$ is abbreviated to $ab$.

Given a field F, the variables x, y,z,... are symbols, or place holders, which can be 
replaced by elements of F. A polynomial in x over F is an expression that is obtained by 
applying the operations of addition and multiplication to the variable x and/or some of 
the elements of F. Thus, both 17x? —(31/2)x and 5x? + 6x +(—6)x? +1+(2/9)x are 
polynomials in x over Q. Actually, they can also be interpreted as polynomials over R, 


over C, and over Z, or any Z, (for p # 2,3) for that matter. On the other hand, 


\[
x^3 + i x^2 + 1 - 3i
\]

is a polynomial over the field of complex numbers $\mathbb{C}$, but not over either the real numbers
or the rational numbers. Moreover, since in $\mathbb{Z}_5$

\[
2^2 = 3^2 = 4 = -1 \pmod{5}
\]

it follows that in $\mathbb{Z}_5$ the quantity $i = \sqrt{-1}$ can be interpreted as either 2 or 3, and so
$x^3 + i x^2 + 1 - 3i$ can be interpreted as a polynomial over $\mathbb{Z}_5$. On the other hand, since $\mathbb{Z}_3$
contains no number $a$ such that $a^2 = 2 = -1 \pmod{3}$, the polynomial $x^3 + i x^2 + 1 - 3i$
cannot be regarded as a polynomial over $\mathbb{Z}_3$.

The set of polynomials in $x$ over $F$ is denoted by $F[x]$ and its members will generally
be denoted by $P(x), Q(x), \ldots$. If $P(x) \in F[x]$, then we also say that $F$ is the ground
field of $P(x)$. The two polynomials

\[
P(x) = a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_{n-1} x + a_n
\]

and

\[
Q(x) = b_0 x^m + b_1 x^{m-1} + b_2 x^{m-2} + \cdots + b_{m-1} x + b_m
\]

are said to be equal if and only if $m = n$ and $a_i = b_i$ for $i = 0, 1, 2, \ldots, m = n$. In
particular, the polynomials $x^3$ and $x$ are considered to be distinct in $\mathbb{Z}_3[x]$ even though
$n^3 = n$ for all $n \in \mathbb{Z}_3$. The reason for this fine distinction will become clear in the next
chapter.

Polynomials over an arbitrary field $F$ can be added, subtracted, and multiplied, and
these operations possess the usual properties of commutativity, associativity, and
distributivity. Thus, the following two polynomials over $\mathbb{Z}_5$ are added and multiplied as

\[
(2x^3 + 3x + 1) + (4x^3 + 2) = 6x^3 + 3x + 3 = x^3 + 3x + 3
\]

and

\[
\begin{aligned}
(2x^3 + 3x + 1)(4x^3 + 2) &= 8x^6 + 4x^3 + 12x^4 + 6x + 4x^3 + 2 \\
&= 8x^6 + 12x^4 + 8x^3 + 6x + 2 = 3x^6 + 2x^4 + 3x^3 + x + 2.
\end{aligned}
\]
The details of these examples indicate that it is always necessary to keep the ground
field in mind, since the final result clearly depends on which field the coefficients belong
to. Many significant properties of a polynomial also depend on the ground field. The
polynomial $x^2 + 1$ factors over the complex numbers, since $x^2 + 1 = (x + i)(x - i)$, but
it is well known that this polynomial cannot be factored over either the rationals or the
real numbers. The same polynomial factors over $\mathbb{Z}_5$ as $x^2 + 1 = (x + 2)(x + 3)$, but does
not factor over $\mathbb{Z}_3$ (see Proposition 6.6 below).
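Arithmetic of this kind is easy to mechanize. The Python sketch below is an illustration only: it assumes that a polynomial over $\mathbb{Z}_p$ is stored as a list of coefficients with the constant term first, and the names `trim`, `poly_add`, and `poly_mul` are hypothetical.

```python
def trim(poly):
    """Remove zero coefficients from the high-degree end of the list."""
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def poly_add(a, b, p):
    """Sum of two polynomials over Z_p (coefficient lists, constant term first)."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return trim([(x + y) % p for x, y in zip(a, b)])

def poly_mul(a, b, p):
    """Product of two polynomials over Z_p."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)

P = [1, 3, 0, 2]   # 2x^3 + 3x + 1
Q = [2, 0, 0, 4]   # 4x^3 + 2
print(poly_add(P, Q, 5))             # [3, 3, 0, 1], i.e. x^3 + 3x + 3
print(poly_mul(P, Q, 5))             # [2, 1, 0, 3, 2, 0, 3]
print(poly_mul([2, 1], [3, 1], 5))   # [1, 0, 1], i.e. (x + 2)(x + 3) = x^2 + 1
```

Changing the modulus argument changes the answers, which is the dependence on the ground field that the text stresses.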

The polynomial 0 is called the zero polynomial. If $P(x)$ is any nonzero polynomial,
then it can clearly be written in the standard form

\[
a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_{n-1} x + a_n, \qquad a_0 \ne 0.
\]

If $a_0 = 1$, the polynomial is said to be a monic polynomial. The exponent $n$ is the degree
of $P(x)$. No degree is assigned to the zero polynomial. A polynomial of degree 0 is
said to be a constant polynomial. The zero polynomial is also considered to be a constant
polynomial. The proof of the following proposition is straightforward and is relegated to
Exercises 6.1.22 and 6.1.23.


Proposition 6.3 Let $P(x)$ and $Q(x)$ be polynomials of degrees $m$ and $n$. If
$P(x) + Q(x)$ is nonzero, then the degree of $P(x) + Q(x)$ is at most $\max\{m, n\}$, and the
degree of $P(x)Q(x)$ equals $m + n$.


We shall now address the issue of division of polynomials in $F[x]$. Much like the
integers, polynomials are also subject to a process of long division. Because of the
fundamental significance of this process, we shall prove its validity for polynomials over
arbitrary ground fields. For the sake of completeness we next offer a proposition that
justifies the process of long division in the general context of fields. The examples that
follow the proof should clarify it and may actually obviate the need for such a proof.
Given any two elements $a$ and $b \ne 0$ of a field $F$, we use the symbol $a/b$ to denote
$a \cdot b^{-1}$ in the usual way.


Proposition 6.4 If $P(x)$ and $D(x) \ne 0$ are two polynomials over $F$, then there exist
polynomials $Q(x), R(x) \in F[x]$ such that

\[
P(x) = D(x)Q(x) + R(x) \tag{6.5}
\]

and if $R(x)$ is not the zero polynomial, then the degree of $R(x)$ is less than the degree of
$D(x)$.




Proof. Suppose

\[
P(x) = a_0 x^m + a_1 x^{m-1} + \cdots + a_m
\]

with $a_0 \ne 0$ and

\[
D(x) = b_0 x^d + b_1 x^{d-1} + \cdots + b_d
\]

with $b_0 \ne 0$. If $m < d$ then we can clearly choose $Q(x) = 0$ and $R(x) = P(x)$. Hence
we may assume that $m \ge d$. We now proceed by induction on $m$ and assume that the
theorem holds for all pairs of polynomials $P(x)$, $D(x)$ in which $P(x)$ has degree less than
some fixed integer $m \ge d$. Let $P(x)$ and $D(x)$ be as given and define the new polynomial

\[
\begin{aligned}
P_1(x) &= \frac{a_0}{b_0} x^{m-d} D(x) \\
&= \frac{a_0}{b_0} x^{m-d} \left(b_0 x^d + b_1 x^{d-1} + \cdots + b_d\right) \\
&= a_0 x^m + \frac{a_0 b_1}{b_0} x^{m-1} + \cdots + \frac{a_0 b_d}{b_0} x^{m-d}.
\end{aligned}
\]

Then, because $P(x)$ and $P_1(x)$ have the same degree and the same first coefficient, it
follows that $P(x) - P_1(x)$ is either 0 or else it is a polynomial of degree less than their
common degree $m$. In the first case $P(x) = P_1(x)$ and we can use $Q(x) = (a_0/b_0)x^{m-d}$
and $R(x) = 0$ to obtain Equation 6.5. In the second case we use the induction hypothesis
on the degrees. Accordingly, there exist polynomials $Q_1(x)$ and $R(x)$, with $R(x)$ either
0 or else of degree less than $d$, such that $P(x) - P_1(x) = Q_1(x)D(x) + R(x)$. But then

\[
\begin{aligned}
P(x) = P_1(x) + Q_1(x)D(x) + R(x) &= \frac{a_0}{b_0} x^{m-d} D(x) + Q_1(x)D(x) + R(x) \\
&= \left[\frac{a_0}{b_0} x^{m-d} + Q_1(x)\right] D(x) + R(x),
\end{aligned}
\]

and so, with $Q(x) = (a_0/b_0)x^{m-d} + Q_1(x)$, the proof is concluded. ∎




The proof of the above proposition, like most inductive proofs, is in fact constructive
and contains a method for finding the quotient $Q(x)$ and the remainder $R(x)$ of any long
division of polynomials. We demonstrate this by dividing the polynomial $x^5 + x^4 + x^2 + 1$
by the polynomial $x^3 + x + 1$.

\[
\begin{array}{r|rrrrrr}
 & & & & x^2 & +x & -1 \\ \hline
x^3 + x + 1 & x^5 & +x^4 & & +x^2 & & +1 \\
 & x^5 & & +x^3 & +x^2 & & \\ \hline
 & & x^4 & -x^3 & & & +1 \\
 & & x^4 & & +x^2 & +x & \\ \hline
 & & & -x^3 & -x^2 & -x & +1 \\
 & & & -x^3 & & -x & -1 \\ \hline
 & & & & -x^2 & & +2
\end{array}
\]

In this case, where the coefficients are real, $Q(x) = x^2 + x - 1$ and $R(x) = -x^2 + 2$. On
the other hand, if the coefficients are taken as elements of $\mathbb{Z}_2$ we get


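\[
\begin{array}{r|rrrrrr}
 & & & & x^2 & +x & +1 \\ \hline
x^3 + x + 1 & x^5 & +x^4 & & +x^2 & & +1 \\
 & x^5 & & +x^3 & +x^2 & & \\ \hline
 & & x^4 & +x^3 & & & +1 \\
 & & x^4 & & +x^2 & +x & \\ \hline
 & & & x^3 & +x^2 & +x & +1 \\
 & & & x^3 & & +x & +1 \\ \hline
 & & & & x^2 & &
\end{array}
\]

so that here $Q(x) = x^2 + x + 1$ and $R(x) = x^2$. Finally, if the division is undertaken over
$\mathbb{Z}_3$, we get

\[
\begin{array}{r|rrrrrr}
 & & & & x^2 & +x & +2 \\ \hline
x^3 + x + 1 & x^5 & +x^4 & & +x^2 & & +1 \\
 & x^5 & & +x^3 & +x^2 & & \\ \hline
 & & x^4 & +2x^3 & & & +1 \\
 & & x^4 & & +x^2 & +x & \\ \hline
 & & & 2x^3 & +2x^2 & +2x & +1 \\
 & & & 2x^3 & & +2x & +2 \\ \hline
 & & & & 2x^2 & & +2
\end{array}
\]

so that in this case $Q(x) = x^2 + x + 2$ and $R(x) = 2(x^2 + 1)$.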
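The constructive proof of Proposition 6.4 translates directly into a procedure. The following Python sketch handles the modular fields $\mathbb{Z}_p$ only; it assumes coefficient lists with the constant term first, and the names `trim` and `poly_divmod` are hypothetical.

```python
def trim(poly, p):
    """Reduce coefficients mod p and drop zeros from the high-degree end."""
    poly = [c % p for c in poly]
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def poly_divmod(num, den, p):
    """Long division over Z_p (p prime): returns (quotient, remainder)."""
    num, den = trim(num, p), trim(den, p)
    if not den:
        raise ZeroDivisionError("division by the zero polynomial")
    inv_lead = pow(den[-1], -1, p)            # 1/b_0 in Z_p (Python 3.8+)
    quot = [0] * max(len(num) - len(den) + 1, 0)
    while len(num) >= len(den):
        shift = len(num) - len(den)           # the exponent m - d
        coef = num[-1] * inv_lead % p         # the coefficient a_0/b_0
        quot[shift] = coef
        for i, c in enumerate(den):           # subtract (a_0/b_0) x^(m-d) D(x)
            num[shift + i] = (num[shift + i] - coef * c) % p
        num = trim(num, p)
    return quot, num

P = [1, 0, 1, 0, 1, 1]   # x^5 + x^4 + x^2 + 1
D = [1, 1, 0, 1]         # x^3 + x + 1
print(poly_divmod(P, D, 2))   # ([1, 1, 1], [0, 0, 1])
print(poly_divmod(P, D, 3))   # ([2, 1, 1], [2, 0, 2])
```

Each pass of the loop removes the leading term of the running remainder, exactly as each step of the induction lowers the degree.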


Exercises 6.1 


Find the quotient and remainder when x” + x4 +x +1 is divided by x? +x? +1 


over Z,. 
Repeat Exercise 6.1.1 over Z;. 
Repeat Exercise 6.1.1 over Z,. 


4. Prove that there is no polynomial $P(x)$ with real coefficients such that $P(x)(x + 1) = 1$.


5. Prove that if $n$ is a composite number, then $\mathbb{Z}_n$ has nonzero elements $a$ and $b$ such
that $ab = 0$.


Prove that the identities in Exercises 6.1.6 to 6.1.9 hold in every field. 


6. $(a + b)(a - b) = a^2 - b^2$

7. $(a + b)(a^2 - ab + b^2) = a^3 + b^3$

8. $(a - b)(a^2 + ab + b^2) = a^3 - b^3$

9. $1 + a + a^2 + \cdots + a^n = \dfrac{a^{n+1} - 1}{a - 1}$ if $a \ne 1$


10.


11.


Expand (x + 1)° over Z,. 12. Expand (2x + 3)° over Z,. 
Expand (x + 1)° over Z;. 13. Expand (2x +3)° over Z,. 




Suppose $a$, $b$, $c$, $d$, $e$, and $f$ are nonzero elements of a field $F$ such that

\[
\frac{a}{b} = \frac{c}{d} = \frac{e}{f}.
\]

Prove the identities in Exercises 6.1.14 to 6.1.17 whenever the denominators in question
are nonzero.

14. $(a + b)/(a - b) = (c + d)/(c - d)$

15. $a/b = (a + c + e)/(b + d + f)$

16. $a/b = (a + 2c - 3e)/(b + 2d - 3f)$

17. afb = / (a? —15c? + 9e?)/(b3 — 15d? +9f?) 


Let $F$ be any field and let $a$ and $x$ be elements of $F$. Prove the statements in
Exercises 6.1.18 to 6.1.21.

18. If $a + x = a$, then $x = 0$.

19. If $ax = a$ and $a \ne 0$, then $x = 1$.

20. If $a + x = 0$, then $x = -a$.

21. If $ax = 1$, then $x = a^{-1}$.

22. Prove the first part of Proposition 6.3 and explain why equality need not hold.

23. Prove the second part of Proposition 6.3.

24. Prove that if $m$ and $n$ are positive integers and $a \in F$, then $a^m \cdot a^n = a^{m+n}$ and
$(a^m)^n = a^{mn}$.

25. Prove that if $m$ and $n$ are integers and $a \in F$, then $ma + na = (m + n)a$ and
$m(na) = (mn)a$.

26. Prove Exercise 6.1.24 when $m$ and $n$ are arbitrary integers.


6.2 The Factorization of Polynomials 


If $P(x)$ is a polynomial in $F[x]$, and if $a$ is an element of the field $F$ such that $P(a) = 0$,
then we say that $a$ is a zero of $P(x)$. Thus,
• $-3$ is a zero of $2x + 6$ if $F = \mathbb{Q}$,
• $5$ is a zero of $x^3 + 3x$ if $F = \mathbb{Z}_7$,
• $\sqrt{7}$ is a zero of $x^2 - 7$ if $F = \mathbb{R}$, and
• $1 + i$ is a zero of $x^4 + 4$ if $F = \mathbb{C}$.
Every precalculus algebra text contains a proposition that relates the zeros of a
polynomial to its factorization. The same relationship holds for polynomials over arbitrary
fields.




Proposition 6.6 Let $P(x)$ be a polynomial over the field $F$. Then $a \in F$ is a zero of
$P(x)$ if and only if there exists a polynomial $Q(x)$ such that $P(x) = Q(x)(x - a)$.

Proof. If such a polynomial $Q(x)$ does exist, then clearly $P(a) = Q(a)(a - a) = 0$, and
so $a$ is indeed a zero of $P(x)$. Conversely, suppose $a$ is a zero of $P(x)$. Set $D(x) =
x - a$, and let $Q(x)$ and $R(x)$ be the polynomials whose existence is guaranteed by
Proposition 6.4. Then $P(x) = Q(x)(x - a) + R(x)$. If $R(x)$ is the zero polynomial, we are
done. Otherwise the degree of $R(x)$ is less than that of $D(x) = x - a$, which is 1. Hence
$R(x)$ is a polynomial of degree zero, so that for some $r \in F$, $P(x) = Q(x)(x - a) + r$.
If we now substitute $x = a$, we obtain $0 = P(a) = Q(a)(a - a) + r = 0 + r = r$, so that
$P(x) = Q(x)(x - a)$. ∎


This proposition has a corollary whose straightforward inductive proof is relegated to
Exercise 6.2.22.

Corollary 6.7 If $P(x)$ is a polynomial over the field $F$ and if $a_1, a_2, \ldots, a_k \in F$ are
distinct zeroes of $P(x)$, then there exists a polynomial $Q(x)$ such that

\[
P(x) = Q(x)(x - a_1)(x - a_2) \cdots (x - a_k).
\]


If P(x), Q(x), and R(x) are polynomials over F such that P(x) = Q(x)R(x), then we 
say that Q(x)R(x) is a factorization of P(x) over F, and that Q(x) and R(x) are factors
or divisors of P(x). If P(x) is a nonconstant polynomial such that in every factorization 
P(x) = Q(x)R(x) either Q(x) or R(x) has degree 0 (i.e., is a nonzero constant), then 
P(x) is said to be irreducible over F. A polynomial that is not irreducible is called 
reducible or factorable. 

The above proposition greatly facilitates the task of factoring polynomials. The
polynomial $P_1(x) = x^2 + x + 3$ has the two zeroes 1 and 3 over $\mathbb{Z}_5$ since $P_1(1) = 5 = 0 \pmod 5$
and $P_1(3) = 15 = 0 \pmod 5$. Hence it follows from Proposition 6.6 that $(x - 1) = (x + 4)$
and $(x - 3) = (x + 2)$ are factors of $x^2 + x + 3$ (over $\mathbb{Z}_5$). Since $x^2 + x + 3$ has degree 2, it
can have at most two factors. Consequently, by Corollary 6.7, $x^2 + x + 3 = (x + 2)(x + 4)$
over $\mathbb{Z}_5$.

Similarly, the polynomial $P_2(x) = 2x^2 + 2x + 3$ has the two zeroes 1 and 5 (mod 7).
Thus $(x - 1)(x - 5) = (x + 6)(x + 2)$ is a divisor of $P_2(x)$. Since the leading coefficient
of $P_2(x)$ is 2, it follows that $2x^2 + 2x + 3 = 2(x + 2)(x + 6)$ over $\mathbb{Z}_7$.

The polynomial $P_3(x) = x^2 + x + 1$ has no zeroes in $\mathbb{Z}_2$ since $P_3(0) = 1 \ne 0 \pmod 2$
and $P_3(1) = 3 \ne 0 \pmod 2$. Since every nonconstant factor of $P_3(x)$ would have the form
$x - a$, $a \in \{0, 1\}$, and since neither 0 nor 1 is a zero of $P_3(x)$, it now follows from
Proposition 6.6 that this polynomial is irreducible over $\mathbb{Z}_2$. On the other hand, since
$P_3(1) = 0 \pmod 3$, it follows that $P_3(x)$ does factor over $\mathbb{Z}_3$ in a nontrivial way.

It is easy to read too much into Corollary 6.7. The polynomial $x^4 + x^2 + 1$ has no
zeroes in $\mathbb{Z}_2$ for the same reasons that $P_3(x)$ does not. Nevertheless,

\[
\begin{aligned}
(x^2 + x + 1)^2 &= (x^2 + x + 1)(x^2 + x + 1) \\
&= x^4 + x^3 + x^2 + x^3 + x^2 + x + x^2 + x + 1 = x^4 + x^2 + 1.
\end{aligned}
\]

Thus, Corollary 6.7 only supplies information about first-degree factors. It may fail to
detect the existence of factors of a higher degree.

The question of which polynomials of degree greater than 3 are irreducible is quite
difficult and cannot be discussed here in its full generality. In the case where the ground
field is $\mathbb{Z}_p$, however, the finiteness of $p$ can be used to make some headway. Thus,
the irreducible polynomials of degree 1, 2, or 3 over $\mathbb{Z}_2$ are (Exercise 6.2.1) $x$, $x + 1$,
$x^2 + x + 1$, $x^3 + x + 1$, and $x^3 + x^2 + 1$. Hence the list of reducible fourth-degree
polynomials over $\mathbb{Z}_2$ is

\[
\begin{array}{ll}
x^4, & (x + 1)^2 (x^2 + x + 1) = x^4 + x^3 + x + 1, \\
x^3 (x + 1) = x^4 + x^3, & (x^2 + x + 1)^2 = x^4 + x^2 + 1, \\
x^2 (x + 1)^2 = x^4 + x^2, & x(x^3 + x + 1) = x^4 + x^2 + x, \\
x(x + 1)^3 = x^4 + x^3 + x^2 + x, & x(x^3 + x^2 + 1) = x^4 + x^3 + x, \\
(x + 1)^4 = x^4 + 1, & (x + 1)(x^3 + x + 1) = x^4 + x^3 + x^2 + 1, \\
x^2 (x^2 + x + 1) = x^4 + x^3 + x^2, & (x + 1)(x^3 + x^2 + 1) = x^4 + x^2 + x + 1, \\
x(x + 1)(x^2 + x + 1) = x^4 + x. &
\end{array}
\]

The remaining three fourth-degree polynomials over $\mathbb{Z}_2$, namely, $x^4 + x^3 + 1$,
$x^4 + x + 1$, and $x^4 + x^3 + x^2 + x + 1$, must therefore be irreducible.
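The elimination argument above is a sieve, and it can be automated for $\mathbb{Z}_2$. The sketch below is an illustration only; it assumes that a polynomial over $\mathbb{Z}_2$ is encoded as an integer whose bit $i$ is the coefficient of $x^i$, and the names `gf2_mul`, `irreducibles_up_to`, and `show` are hypothetical.

```python
def gf2_mul(a, b):
    """Multiply two polynomials over Z_2 encoded as bitmasks."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # adding coefficients mod 2 is exclusive-or
        a <<= 1              # multiply a by x
        b >>= 1
    return result

def irreducibles_up_to(max_degree):
    """Bitmasks of the irreducible polynomials of degree 1..max_degree."""
    limit = 1 << (max_degree + 1)
    reducible = set()
    for a in range(2, limit):            # every polynomial of degree >= 1
        for b in range(a, limit):
            product = gf2_mul(a, b)
            if product < limit:
                reducible.add(product)
    return [f for f in range(2, limit) if f not in reducible]

def show(f):
    """Render a bitmask as a polynomial in x."""
    terms = ["1" if i == 0 else "x" if i == 1 else f"x^{i}"
             for i in range(f.bit_length() - 1, -1, -1) if (f >> i) & 1]
    return " + ".join(terms)

for f in irreducibles_up_to(4):
    print(show(f))
```

The output lists the five irreducible polynomials of degree at most 3 named in the text followed by the three irreducible quartics.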

It is important to realize that the same polynomial may be factorable over one field
and irreducible over another. Thus, we saw above that $x^2 + x + 1$ is irreducible over $\mathbb{Z}_2$
whereas it is easily verified that $x^2 + x + 1 = (x + 2)^2$ over $\mathbb{Z}_3$. Similarly, the polynomial
$x^2 + 1$ is irreducible over $\mathbb{R}$ (it has degree 2 and no zeroes in $\mathbb{R}$) whereas it factors into
$(x + i)(x - i)$ over $\mathbb{C}$.

We conclude this section by extending to arbitrary fields yet another fact that is well 


known for polynomials over the real numbers. 




Proposition 6.8 If $P(x)$ is a polynomial of degree $n$ over any field $F$, then the equation
$P(x) = 0$ has at most $n$ distinct solutions.


Proof. This is proved by induction on $n$. When $n = 0$, $P(x)$ must be a constant
polynomial $c$ for some $c \ne 0$. In this case the polynomial equation $P(x) = 0$ has the form
$c = 0$, which has no (i.e., zero) solutions. Hence the induction process has been anchored
at $n = 0$.

Let $P(x)$ be a polynomial of degree $n > 0$ and suppose that the theorem has been
proved for all polynomials of degree less than $n$. If $P(x)$ has no zeroes, then we are done.
Suppose, therefore, that $r \in F$ is a zero of $P(x)$. Proposition 6.6 implies the existence of a
polynomial $Q(x)$ over $F$ such that $P(x) = Q(x)(x - r)$. It is clear that $Q(x)$ has degree
$n - 1$. If $s$ is any zero of $P(x)$ that is distinct from $r$, then $0 = P(s) = Q(s)(s - r)$.
Since $s$ is distinct from $r$, it follows that $s - r \ne 0$ and hence, by Proposition 6.1, we
may conclude that $Q(s) = 0$. Thus, all the zeroes of $P(x)$ that are distinct from $r$ are
zeroes of $Q(x)$. Since $Q(x)$ has degree $n - 1$, there are, by the induction hypothesis, at
most $n - 1$ zeroes of $P(x)$ that are distinct from $r$. In other words, $P(x)$ has at most
$n$ distinct zeroes. ∎


The actual number of distinct solutions of a polynomial equation will vary with the
polynomial. It is easily verified by direct substitution that the equation

\[
x^3 + x^2 + x + 1 = 0 \pmod 5
\]

has the solutions $x = 2$, $x = 3$, and $x = 4$ in $\mathbb{Z}_5$, whereas the equation

\[
x^3 + 3x + 4 = 0 \pmod 5
\]

has only two distinct solutions, namely, $x = 3$ and $x = 4$. However, since over $\mathbb{Z}_5$ we
have $x^3 + 3x + 4 = (x - 3)^2 (x - 4)$, we say that 3 is a double zero of the polynomial
$x^3 + 3x + 4$, and consequently, counting multiplicities, this polynomial has three zeroes.
In general, if $r$ is any zero of the polynomial $P(x)$, it is said to have multiplicity $m$ if $m$
is the largest integer such that $(x - r)^m$ divides $P(x)$.
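Zeroes and their multiplicities over $\mathbb{Z}_p$ can be found by substitution and repeated division by $x - r$. The Python sketch below is illustrative; it assumes coefficient lists with the constant term first, and the names `evaluate`, `divide_by_linear`, and `multiplicity` are hypothetical.

```python
def evaluate(poly, x, p):
    """Value of the polynomial at x in Z_p (Horner's rule)."""
    value = 0
    for c in reversed(poly):
        value = (value * x + c) % p
    return value

def divide_by_linear(poly, r, p):
    """Synthetic division by (x - r) over Z_p: returns (quotient, remainder)."""
    acc = 0
    partial = []
    for c in reversed(poly):          # work from the leading coefficient down
        acc = (acc * r + c) % p
        partial.append(acc)
    remainder = partial.pop()         # the last value equals P(r)
    return partial[::-1], remainder

def multiplicity(poly, r, p):
    """Largest m such that (x - r)^m divides the nonzero polynomial poly."""
    m = 0
    while len(poly) > 1:
        quotient, remainder = divide_by_linear(poly, r, p)
        if remainder != 0:
            break
        poly, m = quotient, m + 1
    return m

cubic = [4, 3, 0, 1]   # x^3 + 3x + 4
print([x for x in range(5) if evaluate(cubic, x, 5) == 0])   # [3, 4]
print(multiplicity(cubic, 3, 5), multiplicity(cubic, 4, 5))  # 2 1
```

The remainder returned by `divide_by_linear` is $P(r)$, in agreement with Proposition 6.6.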

Thus, as was seen above, 3 and 4 are zeroes of multiplicities 2 and 1, respectively, of the
polynomial $x^3 + 3x + 4$ over $\mathbb{Z}_5$. A slight modification of the proof of Proposition 6.8,
together with the Fundamental Theorem of Algebra that was stated without proof in
Section 3.3, yields the following fact whose proof is relegated to Exercise 6.2.20.




Proposition 6.9 Counting multiplicities, every polynomial of degree $n$ over the complex
numbers has exactly $n$ complex zeroes.


This proposition is in some sense also true even when $P(x)$ has its coefficients in other
fields (Exercise 10.3.23).

Amongst the consequences of the Fundamental Theorem of Algebra are the following
corollaries:

Lemma 6.10 If $P(x)$ is a polynomial with real coefficients, then $r$ is a zero of $P(x)$ if
and only if $\bar{r}$ is one too.


Proof. Let

\[
P(x) = a_0 x^k + a_1 x^{k-1} + a_2 x^{k-2} + \cdots + a_{k-1} x + a_k.
\]

Because the coefficients $a_0, a_1, \ldots, a_k$ are real, it follows that

\[
\overline{P(r)} = \overline{\sum_{i=0}^{k} a_i r^{k-i}} = \sum_{i=0}^{k} \bar{a}_i\, \bar{r}^{\,k-i} = \sum_{i=0}^{k} a_i\, \bar{r}^{\,k-i} = P(\bar{r}).
\]

Consequently $r$ is a zero of $P(x)$ if and only if $\bar{r}$ is such a zero too. ∎


Corollary 6.11 If $P(x)$ is a polynomial with real coefficients, then it can be factored
into irreducible real polynomials of degree at most 2.


Proof. Let $P(x) = a_0 x^k + a_1 x^{k-1} + a_2 x^{k-2} + \cdots + a_{k-1} x + a_k$. By Proposition 6.9 there
exist complex numbers $r_1, r_2, \ldots, r_k$ such that

\[
P(x) = a_0 \prod_{i=1}^{k} (x - r_i).
\]

By the above lemma, if $P(x)$ has $m$ real zeros, then the remaining $k - m$ nonreal zeros
occur in $(k - m)/2$ conjugate pairs and $P(x)$ factors into

\[
a_0 \prod_{i=1}^{m} (x - r_i) \prod_{j=1}^{(k-m)/2} (x - s_j)(x - \bar{s}_j)
\]

where each $r_i$ is real and each $s_j$ is a nonreal complex number. However,

\[
(x - s_j)(x - \bar{s}_j) = x^2 - (s_j + \bar{s}_j)x + s_j \bar{s}_j
\]

where both $s_j + \bar{s}_j$ and $s_j \bar{s}_j$ are real. Thus, $P(x)$ has been factored into linear and
quadratic irreducible factors. ∎




Exercises 6.2 


eo Sr os 


Io. 


Il. 


List and completely factor all the polynomials of degree d < 4 over Z,. 

List and completely factor all the polynomials of the form x° +. ax? + bx +c over 
Z,. 

List and completely factor all the monic quadratic polynomials over Z,. 

List and completely factor all the cubic polynomials of the form x? + x* +ax +6 
over Z;. 

Completely factor all the polynomials of the form x? + ax +1 over Z,. 

6. Prove that the number of irreducible monic quadratic polynomials over $\mathbb{Z}_p$ is $\binom{p}{2}$.

7. Find a formula for the number of irreducible monic cubic polynomials over $\mathbb{Z}_p$.

8. Prove that if $F$ is a finite field, then there is a quadratic polynomial in $F[x]$ that is
irreducible over $F$.

9. Suppose the polynomial $a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$ is irreducible over a field
$F$, with $a_0, a_n \ne 0$. Prove that the polynomial $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ is
also irreducible over $F$.

10. Prove that if the polynomial $P(x) \in F[x]$ is divided by $x - a$, then the remainder
is $P(a)$.

11. Let $a$ be any element of the field $F$, and let $P(x)$ be a polynomial over $F$. Prove
that $P(x)$ is irreducible over $F$ if and only if $P(x + a)$ is irreducible over $F$.


Find the remainder when P(x) = x?° — 2x4 + 3x3 — 4x? +5x —1¢€C[x] is divided by 


the polynomials in Exercises 6.2.12 to 6.2.15. 


12. 


16, 


17. 
18. 


19. 


21. 
22. 


23. 


x—1 13. xe x 14. x?-1 15. xe +) 


For which values of a and 6 will x? +2 bea factor of x'? +ax + 6 over Z,? 
Repeat Exercise 6.2.16 over Z,,. 
Let F be an arbitrary field. Do there exist 2, 6 ¢ F such that x? +2 is a divisor of 


x 4ax+ 6 over F? 
Repeat Exercise 6.2.16 over R. 20. Repeat Exercise 6.2.16 over C. 
Prove Proposition 6.9 assuming the Fundamental Theorem of Algebra (Section 3.3). 


Prove Corollary 6.7. 


Prove that the polynomial xt t xr tx? +41 is irreducible over Q. 




\[
\begin{aligned}
M_{i-1}(x) &= Q_{i-1}(x) N_{i-1}(x) + R_{i-1}(x) \\
M_i(x) &= Q_i(x) N_i(x) + R_i(x) \\
M_{i+1}(x) &= Q_{i+1}(x) N_{i+1}(x) + R_{i+1}(x)
\end{aligned}
\]

Figure 6.1 The Euclidean algorithm for polynomials


6.3 The Euclidean Algorithm for Polynomials 


The process of long division was used once before in Chapter 4 in connection with 
the Euclidean algorithm for finding the greatest common divisor of two integers. It 
proves equally handy in finding the greatest common divisor of two polynomials. We 
define a greatest common divisor of the polynomials M(x) and N(x) as a common divisor 
of maximum degree. The reason for the use of the indefinite article in this definition 
is that if the polynomial $P(x)$ is a divisor of $Q(x)$ and $c$ is any nonzero number in
the ground field, then so is $cP(x)$ a divisor of $Q(x)$. For if $Q(x) = P(x)R(x)$, then
also $Q(x) = [cP(x)][(1/c)R(x)]$. Thus, if $G(x)$ is any greatest common divisor of two
polynomials, then so is $cG(x)$ whenever $c \ne 0$. Exercise 6.3.8 calls for a proof that any
two greatest common divisors of M(x) and N(x) are indeed such multiples of each other, 
but until then some caution must be exercised. 

To find a greatest common divisor of two given polynomials it suffices to mimic the 
Euclidean algorithm for integers. We begin with a lemma which implies that the long 
division process of Proposition 6.4 can be used to reduce the degrees of the polynomials 


in question. 


Lemma 6.12 Suppose $P(x)$, $Q(x)$, $D(x)$, and $R(x)$ are polynomials over the field $F$
such that $P(x) = D(x)Q(x) + R(x)$. Then every greatest common divisor of $P(x)$ and
$D(x)$ is also a greatest common divisor of $D(x)$ and $R(x)$ and vice versa.


Proof. Every common divisor of $D(x)$ and $R(x)$ is also a divisor of $D(x)Q(x) + R(x) =
P(x)$, and hence it is also a common divisor of $D(x)$ and $P(x)$. Conversely, every
common divisor of $P(x)$ and $D(x)$ is also a divisor of $P(x) - D(x)Q(x) = R(x)$, and
hence it is also a common divisor of $D(x)$ and $R(x)$. The two pairs therefore have exactly
the same common divisors, and the complete statement of the lemma now follows
immediately. ∎




Let $M(x)$ and $N(x)$ be two polynomials. Set $M_1(x) = M(x)$ and $N_1(x) = N(x)$ and
let $Q_1(x)$ and $R_1(x)$ be the appropriate quotient and remainder so that

\[
M_1(x) = Q_1(x)N_1(x) + R_1(x).
\]

For $i = 1, 2, 3, \ldots$ we set

\[
M_{i+1}(x) = N_i(x) \quad\text{and}\quad N_{i+1}(x) = R_i(x) \tag{6.13}
\]

with $Q_{i+1}(x)$ and $R_{i+1}(x)$ being the respective quotient and remainder when $M_{i+1}(x)$
is divided by $N_{i+1}(x)$. Figure 6.1 should be helpful.

Note that if $R_i(x)$ is not the zero polynomial, then either $R_{i+1}(x)$ is the zero polynomial
or else

\[
\text{degree of } R_{i+1}(x) < \text{degree of } N_{i+1}(x) = \text{degree of } R_i(x).
\]

Hence, this procedure is bound to eventually produce a remainder $R_k(x)$ which is the
zero polynomial, at which point the algorithm stops. We claim that $R_{k-1}(x)$ is a greatest
common divisor of $M(x)$ and $N(x)$. To see this first note that the bottom line of
Figure 6.1 is equivalent to

\[
R_{i-1}(x) = Q_{i+1}(x) R_i(x) + R_{i+1}(x), \qquad i = 2, 3, \ldots, k - 1.
\]

Hence, by the above lemma, for $i = 2, 3, \ldots, k - 1$ any greatest common divisor of
$R_{i-1}(x)$ and $R_i(x)$ is also a greatest common divisor of $R_i(x)$ and $R_{i+1}(x)$ and vice versa.

A similar argument allows us to conclude that any greatest common divisor of $R_1(x)$
and $R_2(x)$ is also a greatest common divisor of $M(x) = M_1(x)$ and $N(x) = N_1(x)$
(Exercise 6.3.20). Hence, since $R_{k-1}(x)$ is a greatest common divisor of $R_{k-1}(x)$ and $R_k(x)$,
it is also a greatest common divisor of $M(x)$ and $N(x)$.


If the ground field is $\mathbb{Z}_2$ and

\[
M(x) = M_1(x) = x^8 + x^7 + x^6 + x^4 + x^3 + x + 1
\]

and

\[
N(x) = N_1(x) = x^5 + x^4 + x^3 + x^2 + x + 1,
\]

then two long divisions yield

\[
\begin{aligned}
M_2(x) &= N_1(x) = x^5 + x^4 + x^3 + x^2 + x + 1, \\
N_2(x) &= R_1(x) = x^4 + x^3 + x^2, \\
M_3(x) &= N_2(x) = x^4 + x^3 + x^2, \\
N_3(x) &= R_2(x) = x^2 + x + 1,
\end{aligned}
\]

and $R_3(x) = 0$. Thus $R_2(x) = x^2 + x + 1$ is the required greatest common divisor of
$M(x)$ and $N(x)$.
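The procedure just described can be sketched in a few lines of Python for the fields $\mathbb{Z}_p$. This is an illustration only; it assumes coefficient lists with the constant term first, and the names `trim`, `poly_divmod`, and `poly_gcd` are hypothetical.

```python
def trim(poly, p):
    """Reduce coefficients mod p and drop zeros from the high-degree end."""
    poly = [c % p for c in poly]
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def poly_divmod(num, den, p):
    """Long division over Z_p (p prime): returns (quotient, remainder)."""
    num, den = trim(num, p), trim(den, p)
    if not den:
        raise ZeroDivisionError("division by the zero polynomial")
    inv_lead = pow(den[-1], -1, p)
    quot = [0] * max(len(num) - len(den) + 1, 0)
    while len(num) >= len(den):
        shift = len(num) - len(den)
        coef = num[-1] * inv_lead % p
        quot[shift] = coef
        for i, c in enumerate(den):
            num[shift + i] = (num[shift + i] - coef * c) % p
        num = trim(num, p)
    return quot, num

def poly_gcd(m, n, p):
    """Euclidean algorithm: replace (M, N) by (N, R) until R is zero."""
    m, n = trim(m, p), trim(n, p)
    while n:
        _, r = poly_divmod(m, n, p)
        m, n = n, r
    return m   # the last nonzero remainder; not necessarily monic

M = [1, 1, 0, 1, 1, 0, 1, 1, 1]   # x^8 + x^7 + x^6 + x^4 + x^3 + x + 1
N = [1, 1, 1, 1, 1, 1]            # x^5 + x^4 + x^3 + x^2 + x + 1
print(poly_gcd(M, N, 2))          # [1, 1, 1], i.e. x^2 + x + 1
```

Over $\mathbb{Z}_p$ with $p > 2$ the result may differ from another greatest common divisor by a nonzero constant factor, as the text cautions.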
It will subsequently prove useful to have a polynomial version of Proposition 4.1 


available. 


Proposition 6.14 If $G(x)$ is a greatest common divisor of the polynomials $M(x)$ and
$N(x)$ over the field $F$, then there exist polynomials $A(x)$ and $B(x)$ over $F$ such that
$A(x)M(x) + B(x)N(x) = G(x)$.


Proof. The proof we give applies only to the greatest common divisor obtained by the
Euclidean algorithm, which we shall call the Euclidean greatest common divisor. The
proposition’s validity for all greatest common divisors then follows from Exercises 6.3.7
and 6.3.8.

We mimic the inductive proof of Proposition 4.1 and use the notation employed above
in the description of the Euclidean algorithm for polynomials. Let $k$ be the number of
divisions in the application of the Euclidean algorithm to $M(x)$ and $N(x)$. If $k = 1$,
this means that $R_1(x)$ is the zero polynomial so that $N(x)$ is a divisor of $M(x)$ and
the Euclidean greatest common divisor $G(x)$ of $M(x)$ and $N(x)$ is $N(x)$ itself. Thus,
choosing $A(x) = 0$ and $B(x) = 1$ we get $0 \cdot M(x) + 1 \cdot N(x) = G(x)$. Assume that the
theorem holds for all pairs of polynomials for which the Euclidean algorithm requires
$k - 1$ divisions. If $M(x)$ and $N(x)$ are a pair that require $k$ divisions to arrive at their
Euclidean greatest common divisor $G(x)$, then the pair $M_2(x)$ and $N_2(x)$ given by
Equation 6.13 requires only $k - 1$ divisions to arrive at their Euclidean greatest common
divisor, which is also $G(x)$. By the induction hypothesis, there exist polynomials $A'(x)$
and $B'(x)$ such that

\[
A'(x)M_2(x) + B'(x)N_2(x) = G(x).
\]

However, if $Q_1(x)$ and $R_1(x)$ are the quotient and remainder obtained when $M(x)$ is
divided by $N(x)$, then $M_2(x) = N_1(x)$ and $N_2(x) = R_1(x) = M_1(x) - Q_1(x)N_1(x)$, and
so

\[
\begin{aligned}
A'(x)N_1(x) + B'(x)[M_1(x) - Q_1(x)N_1(x)] &= G(x), \\
B'(x)M(x) + [A'(x) - B'(x)Q_1(x)]N(x) &= G(x),
\end{aligned}
\]

so that $A(x) = B'(x)$ and $B(x) = A'(x) - B'(x)Q_1(x)$ are the required polynomials. ∎


Consider x^4 + x^3 + x + 1 and x^5 + x^2 + x + 1 as polynomials in Z_2[x]. Then

x^5 + x^2 + x + 1 = (x + 1)(x^4 + x^3 + x + 1) + (x^3 + x),
x^4 + x^3 + x + 1 = (x + 1)(x^3 + x) + (x^2 + 1),

and x^3 + x = x(x^2 + 1). Hence, x^2 + 1 is a greatest common divisor of the two given
polynomials, and

x^2 + 1 = (x^4 + x^3 + x + 1) + (x + 1)(x^3 + x)
        = (x^4 + x^3 + x + 1) + (x + 1)[(x^5 + x^2 + x + 1) + (x + 1)(x^4 + x^3 + x + 1)]
        = [1 + (x + 1)^2](x^4 + x^3 + x + 1) + (x + 1)(x^5 + x^2 + x + 1)
        = x^2(x^4 + x^3 + x + 1) + (x + 1)(x^5 + x^2 + x + 1).
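
The back-substitution in this example can be mechanized by carrying the two cofactors along with the remainders. The following Python sketch is one way to do it, under our own conventions: coefficient lists with index i holding the coefficient of x^i, a prime modulus p, and the hypothetical name bezout. It is an illustration of Proposition 6.14 for the Euclidean greatest common divisor, not the book's procedure verbatim.

```python
def trim(a):
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def sub(a, b, p):
    """a(x) - b(x) over Z_p."""
    size = max(len(a), len(b))
    a = a + [0] * (size - len(a))
    b = b + [0] * (size - len(b))
    return trim((x - y) % p for x, y in zip(a, b))


def mul(a, b, p):
    """a(x) * b(x) over Z_p."""
    out = [0] * max(len(a) + len(b) - 1, 0)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)


def divmod_p(a, b, p):
    """Quotient and remainder of a(x) / b(x) over Z_p (b nonzero, p prime)."""
    a = trim(a)
    inv_lead = pow(b[-1], -1, p)
    q = [0] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        shift, coef = len(a) - len(b), a[-1] * inv_lead % p
        q[shift] = coef
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - coef * c) % p
        a = trim(a)
    return trim(q), a


def bezout(m, n, p):
    """Return (g, a, b) with a*m + b*n = g, g the Euclidean gcd over Z_p."""
    r0, r1 = trim(c % p for c in m), trim(c % p for c in n)
    a0, a1, b0, b1 = [1], [], [], [1]      # invariant: r_i = a_i*m + b_i*n
    while r1:
        q, r = divmod_p(r0, r1, p)
        r0, r1 = r1, r
        a0, a1 = a1, sub(a0, mul(q, a1, p), p)
        b0, b1 = b1, sub(b0, mul(q, b1, p), p)
    return r0, a0, b0


M = [1, 1, 1, 0, 0, 1]    # x^5 + x^2 + x + 1
N = [1, 1, 0, 1, 1]       # x^4 + x^3 + x + 1
g, a, b = bezout(M, N, 2)
print(g, a, b)            # gcd x^2 + 1 with cofactors x + 1 and x^2
```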


Irreducible polynomials are the analogs of prime numbers, and just like the integers, 
polynomials have a unique factorization theorem. The formal statement of this theorem 
and its proof are relegated to Exercises 6.3.15 and 6.3.16. 

Two polynomials are said to be relatively prime if their only common divisors are 
constants (i.e., elements of the ground field F). It is clear that such relatively prime 
polynomials have 1 as their greatest common divisor. Consequently we have the following
proposition which will turn out to be very useful, in a nonobvious way, in Section 7.1.


Proposition 6.15 If M(x) and N(x) are relatively prime polynomials over a field F,
then there exist polynomials A(x), B(x) ∈ F[x] such that A(x)M(x) + B(x)N(x) = 1.




Exercises 6.3 


1. Find a greatest common divisor of the polynomials x° + x +1 and x6 +x° +x4+ 


x? +1 over Z,. 


2. Find a greatest common divisor of the polynomials x° + x4 +x +1 and x7 +x4+ 


x? +1 over Z,. 


3. Find a greatest common divisor of the polynomials x° + x4+2x? +x +2 and 


x9 42x44 x? +2 over Zs. 


4. Find a greatest common divisor of the polynomials x° + x° + x44+2 and x7 +.x°+ 


x? +2x3 +x? 42x +41 over Z3. 
5- Repeat Exercise 6.3.1 over Z,. 
6. Repeat Exercise 6.3.3 over Z,. 


7 Let M(x) and N(x) be any two polynomials over an arbitrary field F, and suppose 
D(x) is another such polynomial that divides both M(x) and N(x). Prove that 
D(x) divides the Euclidean greatest common divisor of M(x) and N(x). 


8. Let G(x) be the Euclidean greatest common divisor of M(x), N(x) € F [x], and 
let H(x) be any polynomial over F. Prove that H(x) is a greatest common divisor 
of M(x) and N(x) if and only if there exists a nonzero constant ¢ € F such that 


H(x)=cG(x). 
9. Do there exist polynomials A(x) and B(x) over R such that 


A(x)(x? + 3x +2)+ B(x)(x?-1)=x+2? 
10. Do there exist polynomials A(x) and B(x) over R such that 
A(x)(x? + 3x +2)+ B(x)(x?-1) = x*-—x—2? 
11. Do there exist polynomials A(x) and B(x) over R such that 
A(x)(x? —5x +6)+ B(x)(x?+x—6)=x7 +1? 
12. Do there exist polynomials A(x) and B(x) over R such that


A(x)(x? — 4) + B(x)(x? + 2x —8) = x?—2x? 


13. Let M(x) and N(x) be any two polynomials over an arbitrary field F. Characterize
all the polynomials K(x) that can be expressed in the form

K(x) = A(x)M(x) + B(x)N(x)

for some polynomials A(x) and B(x) over F.

14. Let M(x) and N(x) be any two polynomials over an arbitrary field F. Of all the
polynomials that can be expressed in the form A(x)M(x) + B(x)N(x) for some
A(x), B(x) ∈ F[x], let H(x) be one that possesses minimum degree. Prove that
H(x) is a greatest common divisor of M(x) and N(x).

15. Let P(x), M(x), N(x) be polynomials over F such that P(x) is irreducible and
P(x) is a divisor of the product M(x)N(x). Prove that P(x) is a divisor of either
M(x) or N(x).

16. Let N(x) be any monic polynomial over the field F. Prove that there exist monic
irreducible polynomials P_1(x), P_2(x), ..., P_k(x) and positive integers r_1, r_2, ..., r_k
such that

N(x) = P_1^{r_1}(x) P_2^{r_2}(x) ... P_k^{r_k}(x).

Moreover, if R_1(x), R_2(x), ..., R_ℓ(x) is another set of irreducible polynomials, and
s_1, s_2, ..., s_ℓ is another set of positive integers such that

N(x) = R_1^{s_1}(x) R_2^{s_2}(x) ... R_ℓ^{s_ℓ}(x),

then ℓ = k, and the R_i's can be reindexed so that P_i = R_i and r_i = s_i for i = 1, 2, ..., k.
(Hint: see Theorem 4.9.)

17. Suppose the polynomials x^3 + 3px^2 + 3qx + r and x^2 + 2px + q have a nonconstant
greatest common divisor. Show that

4(p^2 - q)(q^2 - pr) - (pq - r)^2 = 0.

18. Prove that the polynomial P(x) ∈ C[x] has a zero of multiplicity at least two if and
only if (P(x), P'(x)) is not a constant, where P'(x) is the derivative of P(x).

19. Prove that if the polynomial ax^3 + 3bx^2 + 3cx + d ∈ C[x] has a zero of multiplicity
2, then this zero is

(bc - ad) / (2(ac - b^2)).
20. 


21. 


22. 


23. 


24. 


25. 


26. 


27. 




Supply the missing details in the verification of the Euclidean algorithm for poly- 
nomials by proving that any greatest common divisor of R,(x) and R,(x) is also a 


greatest common divisor of M(x) and N(x). 


Let ζ be a complex primitive n-th root of unity. Prove that

∏_{k=1}^{n} (x - ζ^k) = x^n - 1.
Let a be any primitive root modulo p for some prime p. Prove that

∏_{k=1}^{p-1} (x - a^k) = x^{p-1} - 1 over Z_p.


For each positive integer n let ζ_{n,1}, ζ_{n,2}, ..., ζ_{n,m} be all the complex primitive n-th
roots of unity (m will depend on n). Prove that the polynomial

P(x) = ∏_{k=1}^{m} (x - ζ_{n,k})

has real rational coefficients for each n.


Prove that if P(x) and Q(x) are polynomials over both the fields F ⊂ F', then the
greatest common divisor of P(x) and Q(x) over F is also their greatest common
divisor over F'.

The polynomials x + 1 and 1 are greatest common divisors of x + 1 and x* +1 
over Z, and Z,, respectively. Explain why this does not contradict Exercise 6.3.24. 
Let M(x)=x°+x+1 and N(x) = x°+x°?+x4+x°41 be the polynomials 
of Exercise 6.3.1, and let G(x) be their greatest common divisor over Z,. Find 
polynomials A(x) and B(x) such that G(x) = A(x)M(x) + B(x)N(x) over Z,. 
Let M(x) =x +x4+x+1 and N(x)=x7+x4+x341 be the polynomials 
of Exercise 6.3.2, and let G(x) be their greatest common divisor over Z,. Find 
polynomials A(x) and B(x) such that G(x) = A(x)M(x)+ B(x)N(x) over Z,. 




6.4 Elementary Symmetric Polynomials 


It was already observed in Chapter 1 that there is a close relationship between the
coefficients of a quadratic equation and its roots. Namely, if r and s are the roots of the
quadratic equation ax^2 + bx + c = 0, then

r + s = -b/a and rs = c/a.

This will now be generalized to polynomial equations of arbitrary degrees and with
coefficients in arbitrary fields. First, however, we simplify the statements of the subsequent
theorems by restricting attention to monic polynomials. Thus, for the monic quadratic
equation x^2 + bx + c = 0, we have, by Proposition 1.3, r + s = -b and rs = c. In
general, suppose we have a monic polynomial

P(x) = x^n + a_1 x^{n-1} + a_2 x^{n-2} + ... + a_{n-1} x + a_n, (6.16)


with coefficients in an arbitrary field F. Suppose further that P(x) has been factored
into linear factors so that

P(x) = (x - r_1)(x - r_2)...(x - r_i)...(x - r_n). (6.17)

Then if all the (x - r_i) of Equation 6.17 are multiplied out, and if all like terms are added,
the right-hand side of Equation 6.16 should be obtained. Before these summands are
added, there are 2^n of them, each having the form

(-1)^k A_1 A_2 ... A_i ... A_n, (6.18)

where each A_i is either x or r_i, and k is the number of the A_i's that equal r_i. For
example, there are n summands that contain n - 1 x's, namely

(-1)^1 r_1 x x x ... x x = -r_1 x^{n-1},
(-1)^1 x r_2 x x ... x x = -r_2 x^{n-1},
(-1)^1 x x r_3 x ... x x = -r_3 x^{n-1},
...
(-1)^1 x x x x ... x r_n = -r_n x^{n-1}.


The sum of these terms must agree with the x^{n-1} term in the right-hand side of
Equation 6.16 and so we conclude that

-r_1 x^{n-1} - r_2 x^{n-1} - r_3 x^{n-1} - ... - r_n x^{n-1} = a_1 x^{n-1},

or r_1 + r_2 + r_3 + ... + r_n = -a_1. Similarly, each summand in Equation 6.18 that contains
n - 2 x's has two of its A_i's equal to the corresponding r_i's, the rest of the A_i's being x,
and the value of k is 2. Thus, we conclude that

r_1 r_2 x^{n-2} + r_1 r_3 x^{n-2} + ... + r_1 r_n x^{n-2} + r_2 r_3 x^{n-2} + ... + r_{n-1} r_n x^{n-2} = a_2 x^{n-2},

or r_1 r_2 + r_1 r_3 + ... + r_1 r_n + r_2 r_3 + ... + r_{n-1} r_n = a_2. The pattern and its
justification should now be clear and we only need a definition before this discussion can be
summarized in a theorem. Let r_1, r_2, ..., r_n be a sequence of numbers (in any field), and
let k be any positive integer 1 ≤ k ≤ n. Then we denote by Σ r_1 r_2 ... r_k the sum of all
the products of the form r_{i_1} r_{i_2} ... r_{i_k}, where 1 ≤ i_1 < i_2 < ... < i_k ≤ n. Thus,

Σ r_1 = r_1 + r_2 + ... + r_n,
Σ r_1 r_2 = r_1 r_2 + r_1 r_3 + ... + r_1 r_n + r_2 r_3 + ... + r_{n-1} r_n,
Σ r_1 r_2 ... r_n = r_1 r_2 ... r_n.

At this point it is convenient to extend the notion of a polynomial to several variables. 
If x, y,z,... are variables, then any function that is obtained by adding, subtracting, or 
multiplying these variables and/or elements of the ground field F is called a polynomial 


over F. Thus, 
33 


197 x9 y7273 + x? — pe OE al 
17 6 
is a polynomial over any field F in which 17, 3, and 2 are all distinct from 0. 
The polynomials Σ r_1, Σ r_1 r_2, ..., Σ r_1 r_2 ... r_n are called the elementary symmetric
polynomials. The above considerations prove the following theorem.


Theorem 6.19 Suppose that

P(x) = x^n + a_1 x^{n-1} + a_2 x^{n-2} + ... + a_{n-1} x + a_n = (x - r_1)(x - r_2)...(x - r_n).

Then Σ r_1 r_2 ... r_k = (-1)^k a_k for k = 1, 2, ..., n.
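
The statement of Theorem 6.19 is easy to check numerically. The sketch below is an illustration only: the helper names elementary_symmetric and expand_monic are ours, and the roots 1, 2, 3, 4 are an arbitrary test case rather than an example from the text. It multiplies out the linear factors and compares each coefficient a_k with the k-th elementary symmetric polynomial of the roots.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(roots, k):
    """Sum of all products of k roots with increasing indices."""
    return sum(prod(combo) for combo in combinations(roots, k))


def expand_monic(roots):
    """Coefficients [1, a_1, ..., a_n] of (x - r_1)...(x - r_n), highest power first."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        coeffs = [c - r * prev for c, prev in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


roots = [1, 2, 3, 4]
a = expand_monic(roots)            # a[k] is the coefficient a_k
for k in range(1, len(roots) + 1):
    assert elementary_symmetric(roots, k) == (-1) ** k * a[k]
print(a)                           # [1, -10, 35, -50, 24]
```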

This theorem can of course be used to conclude that the sum of the roots of the equation
x^3 + 6x + 5 = 0 is 0 (the coefficient of x^2) and that their product is (-1)^3·5 = -5. It can,
however, be brought to bear on other interesting expressions as well. Let Σ r_i^α r_j^β r_k^γ ...
denote the sum of all the distinct monomials obtained by permuting the indices i, j, k, ...
in the monomial r_i^α r_j^β r_k^γ .... Then Σ r_i^α r_j^β r_k^γ ... can be expressed in terms of the
elementary symmetric polynomials. This fact, which is known as the Fundamental
Theorem of Symmetric Polynomials, will not be proved here.

Instead, some special cases will be considered. The expression Σ r_1^2, which denotes
the sum of the squares of the zeroes of the polynomial P(x) of Equation 6.16, can be
evaluated, with the help of the Multinomial Theorem, as follows:

Σ r_1^2 = (Σ r_1)^2 - 2 Σ r_1 r_2 = a_1^2 - 2(-1)^2 a_2 = a_1^2 - 2a_2.

In particular, the sum of the squares of the roots of the equation x^3 + 6x + 5 = 0 is
0^2 - 2·6 = -12. Similarly, if r_1, r_2, r_3, r_4 are the roots of the equation x^4 + 5x^3 - 3x^2 +
7x + 10 = 0, then

1/r_1 + 1/r_2 + 1/r_3 + 1/r_4 = (Σ r_1 r_2 r_3)/(r_1 r_2 r_3 r_4) = -7/10.
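
Identities of this kind need only the coefficients, never the roots themselves. As a small illustration (the function names are ours, and the reciprocal formula -a_{n-1}/a_n is our restatement of Theorem 6.19 for a monic polynomial with a_n ≠ 0), the two quantities discussed above can be computed as follows.

```python
from fractions import Fraction


def sum_of_squares(a):
    """Sum of squared roots of x^n + a[0]x^(n-1) + a[1]x^(n-2) + ... (n >= 2)."""
    return a[0] ** 2 - 2 * a[1]


def sum_of_reciprocals(a):
    """Sum of 1/r_i: e_(n-1)/e_n = -a_(n-1)/a_n, assuming a_n != 0 and n >= 2."""
    return Fraction(-a[-2], a[-1])


print(sum_of_squares([0, 6, 5]))              # x^3 + 6x + 5: gives -12
print(sum_of_reciprocals([5, -3, 7, 10]))     # x^4 + 5x^3 - 3x^2 + 7x + 10: gives -7/10

# Sanity check against a polynomial with known roots 1, 2, 3, 4.
assert sum_of_squares([-10, 35, -50, 24]) == 1 + 4 + 9 + 16
assert sum_of_reciprocals([-10, 35, -50, 24]) == Fraction(25, 12)
```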
The next proposition provides the general framework for verifying that any proposed set
of roots does indeed constitute a complete solution set.


Proposition 6.20 If r_1, r_2, ..., r_n are elements of the field F and

P(x) = x^n + a_1 x^{n-1} + a_2 x^{n-2} + ... + a_{n-1} x + a_n

is a polynomial over F such that

Σ r_1 r_2 ... r_k = (-1)^k a_k for k = 1, 2, ..., n,

then P(x) = (x - r_1)(x - r_2)...(x - r_n).


Proof. Set

Q(x) = (x - r_1)(x - r_2)...(x - r_n) = x^n + b_1 x^{n-1} + b_2 x^{n-2} + ... + b_{n-1} x + b_n.

Then, by Theorem 6.19, for k = 1, 2, ..., n,

b_k = (-1)^k Σ r_1 r_2 ... r_k = (-1)^k (-1)^k a_k = a_k.

Hence P(x) and Q(x) are identical polynomials. ∎


We are now in position to eliminate the rough edges of the solution in Chapter 3 of
the cubic equation. That solution was incomplete in that it was not proved that the three
roots x_1, x_2, and x_3 selected in Equation 3.6 constitute the complete solution set of the
general cubic equation x^3 + ax^2 + bx + c = 0.


Corollary 6.21 The values x_1, x_2, and x_3 given by Equation 3.6 are the complete
solution set of the cubic equation x^3 + ax^2 + bx + c = 0.


Proof. Since the reduced cubic equation y^3 + py + q = 0 was obtained by setting x =
y - a/3, it follows from the same Equation 3.6 that it suffices to show that {y_1, y_2, y_3} is
the complete solution set of this reduced cubic equation. By Proposition 6.20 it suffices to
prove that y_1 + y_2 + y_3 = 0, y_1 y_2 + y_2 y_3 + y_3 y_1 = p, and y_1 y_2 y_3 = -q. We demonstrate
only the last of these equalities, leaving the other two to Exercise 6.4.31. It follows from
Equation 3.6 that


p po\( , pe 
Vi I2I3 = 41 32 a ae 7 eae 


2 3,3 
whi Sok SPE SS AB Pp Be te ae ee 
= WS; oe +0° +o og +0 +o 78 
Zz 2 3 3 
== 1040740) 44a? +0)= P page, 
3 92, 272; 272? 


Now, by Equation 3.4, z^3 is one of the roots of the quadratic equation u^2 + qu -
p^3/27 = 0 and hence the other one is -p^3/(27z^3). Since the sum of the roots of this
quadratic is -q, it now follows that z^3 - p^3/(27z^3) = -q, and so y_1 y_2 y_3 = -q. ∎




Exercises 6.4 


Let r, s, and t be the roots of the equation x^3 + ax^2 + bx + c = 0. Rewrite the
expressions in Exercises 6.4.1 to 6.4.7 in terms of a, b, and c. (Wherever necessary, you may
assume that the denominators are not zero.)

ores? te? 5. 1fr+1/s+i/t 
2 (rtsPe(rttyPra(ste? 
6. Afr? +1/s?+1/e? 

3. (rts\rtety(st+e) 

40st hr tr tse? 7 If(rts)+if(r+t)tif(stt) 


Let 7, s, and ¢ be the zeroes of the real polynomial x? + 23x +1. Find the real cubic 
polynomial whose zeroes are those that appear in Exercises 6.4.8 to 6.4.13. 
8. 7,57, t? 10. rs, st, tr 12. kr, ks, kt 


9. rts,st+t,ttr ir. l+r,1lt+s,lt+¢ 13. k/r,k/s, k/t 


Let 7,, 7), 75, and 7, be the zeroes of the polynomial x‘ + 3x? — 5x +1. Evaluate the 


expressions in Exercises 6.4.14 to 6.4.16. 

4. eee 15. 1m 41/ry + 1/41/14 

16. ("%—-7; ny +(43- 7% ad A zy 

17. Solve the equation x* — 4x4 42x? — 8x? — 35x + 140 =0 whose solutions have 
the form r, —r, 5, —s, ¢. 

18. The equation x° + 3x4 — x? — 11x? — 17x —5 =0 has two roots whose product is 
1. Determine these two roots. 

19. The equation x° — 409x + 285 = 0 has two roots whose sum equals 5. Determine 
these two roots. 


: n n—1 n—2 

Let 7), 7%, 73... 7, be the zeroes of the polynomial x” +a,x""" +a)x” "4-044, )x+ 
a,,. Prove the identities in Exercises 6.4.20 to 6.4.25. 

20. or) 1, =3a,— 4,4, 22. orp 1, = 4,4, —4a, 

21. dor? =3a,a,—a? —3a 23. S rr} =a3 —2a,a,+ 6a 

: pea 4) 3 1d Sod 143 4 
Dash ans 1A 2 4 
24. Si rj ry = 4/4, — 2a, — 4,4, — 4a, 


25. Sor! =a} —4aza, + 2a; + 4a,a, — 4a, 




2 


26. Let 1,7 ....7, be the zeroes of the polynomial x” + a,x") + a,x"? +--+ + 


a, _,x+a, where a, #0. Prove that the zeroes of 
a 
Ja yn 4 tre 2 


are 17 ris If tyyweny A he 


n 


Let 1, Py 73»--+» 7, be the zeroes of the polynomial x” +.4,%""|+a,x" 7 4+---+4,_)x+ 


a, where a, #0. Express the sums in Exercises 6.4.27 to 6.4.30 in terms of a, 4),...,4,. 
27. Yo1/r, as. Saye 29- So1/r,r; 30. 1/r? 


31. Complete the proof of Corollary 6.21. 


6.5  Lagrange's Solution of the Quartic Equation 


In 1771 Lagrange wrote a lengthy treatise titled Reflexions sur la Résolution Algébrique des 
Equations in which he summarized what was known about solvability of equations by 
radicals. He also added some thoughts of his own and in fact proved several theorems that 
eventually did lead to the resolution of this issue by the next generation of mathematicians. 
It is with this contribution of Lagrange’s as well as some of its subsequent developments 
that most of the rest of this book is concerned. 

One of the methods that Lagrange offered for the solution of quartic equations began
with the seemingly innocuous observation that when the roots r_1, r_2, r_3, and r_4 are
permuted (in other words, substituted for each other), the expression r_1 r_2 + r_3 r_4 assumes
only three values, namely, itself, r_1 r_3 + r_2 r_4, and r_1 r_4 + r_2 r_3.

For example, when the variables are interchanged by cycling them to the left, r_1 r_2 +
r_3 r_4 becomes

r_2 r_3 + r_4 r_1 = r_1 r_4 + r_2 r_3.
If, on the other hand, only r, and r, are switched, the polynomial is transformed into 
WGN 13%, = 773+ 04. 


This fact can be used to solve the quartic in the following manner. Let r_1, r_2, r_3,
and r_4 denote the four roots of the equation x^4 + ax^3 + bx^2 + cx + d = 0 and set
A = r_1 r_2 + r_3 r_4, B = r_1 r_3 + r_2 r_4, and C = r_1 r_4 + r_2 r_3. Clearly, A + B + C = Σ r_1 r_2 = b.

Next,

AB + AC + BC
  = (r_1 r_2 + r_3 r_4)(r_1 r_3 + r_2 r_4) + (r_1 r_2 + r_3 r_4)(r_1 r_4 + r_2 r_3) + (r_1 r_3 + r_2 r_4)(r_1 r_4 + r_2 r_3)
  = Σ r_1^2 r_2 r_3 = (Σ r_1)(Σ r_1 r_2 r_3) - 4(Σ r_1 r_2 r_3 r_4) = (-a)(-c) - 4d = ac - 4d.

Finally,

ABC = (r_1 r_2 + r_3 r_4)(r_1 r_3 + r_2 r_4)(r_1 r_4 + r_2 r_3)
    = Σ r_1^3 r_2 r_3 r_4 + Σ r_1^2 r_2^2 r_3^2
    = r_1 r_2 r_3 r_4 Σ r_1^2 + (Σ r_1 r_2 r_3)^2 - 2 r_1 r_2 r_3 r_4 Σ r_1 r_2
    = d[(Σ r_1)^2 - 2 Σ r_1 r_2] + (-c)^2 - 2d Σ r_1 r_2
    = d(a^2 - 2b) + c^2 - 2db
    = a^2 d + c^2 - 4bd.

These computations lead to the observation that if r_1, r_2, r_3, r_4 are the roots of the
quartic equation

x^4 + ax^3 + bx^2 + cx + d = 0,

then r_1 r_2 + r_3 r_4, r_1 r_3 + r_2 r_4, and r_1 r_4 + r_2 r_3 are the roots of the cubic equation

y^3 - by^2 + (ac - 4d)y - (a^2 d + c^2 - 4bd) = 0.
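
The three symmetric-function computations can be spot-checked on a quartic with known roots. In this sketch (the function names and the sample roots 1, 2, 3, 4 are our own choices, not the text's) the auxiliary cubic is built once from the coefficients a, b, c, d and once directly from A, B, C, and the two results are compared.

```python
def expand_monic(roots):
    """Coefficients of the monic polynomial with the given roots, highest power first."""
    coeffs = [1]
    for r in roots:
        coeffs = [c - r * prev for c, prev in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def resolvent_from_coefficients(a, b, c, d):
    """y^3 - b y^2 + (ac - 4d) y - (a^2 d + c^2 - 4bd), highest power first."""
    return [1, -b, a * c - 4 * d, -(a * a * d + c * c - 4 * b * d)]


def resolvent_from_roots(r1, r2, r3, r4):
    """Monic cubic whose roots are A, B, C as defined in the text."""
    A = r1 * r2 + r3 * r4
    B = r1 * r3 + r2 * r4
    C = r1 * r4 + r2 * r3
    return expand_monic([A, B, C])


quartic = expand_monic([1, 2, 3, 4])          # [1, -10, 35, -50, 24]
_, a, b, c, d = quartic
print(resolvent_from_coefficients(a, b, c, d))  # [1, -35, 404, -1540]
print(resolvent_from_roots(1, 2, 3, 4))         # the same list
```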


Since every cubic equation is already known to be solvable by radicals, it follows that
A, B, and C have algebraic expressions in a, b, c, and d. The actual values of r_1,
r_2, r_3, and r_4 are now extracted with relative ease. By Exercise 6.5.8 we may assume
that r_1 r_2 ≠ r_3 r_4. Since r_1 r_2 + r_3 r_4 = A and (r_1 r_2)(r_3 r_4) = d, it follows that α = r_1 r_2
and β = r_3 r_4 are the solutions of the quadratic z^2 - Az + d = 0, and so they too have
algebraic expressions in a, b, c, and d. Moreover, (r_1 + r_2) + (r_3 + r_4) = -a and

r_3 r_4 (r_1 + r_2) + r_1 r_2 (r_3 + r_4) = Σ r_1 r_2 r_3 = -c.

When this system of simultaneous equations is solved for (r_1 + r_2) and (r_3 + r_4) we get

γ = r_1 + r_2 = (c - a r_1 r_2)/(r_1 r_2 - r_3 r_4) = (c - aα)/(α - β)

and

δ = r_3 + r_4 = (a r_3 r_4 - c)/(r_1 r_2 - r_3 r_4) = (aβ - c)/(α - β).

Note that γ and δ both have algebraic expressions in a, b, c, and d. Finally, since
r_1 + r_2 = γ and r_1 r_2 = α, it follows that r_1 and r_2 are the solutions of the quadratic
u^2 - γu + α = 0, and similarly r_3 and r_4 are the solutions of the quadratic v^2 - δv + β = 0.


Thus we have proved the following theorem. 


Theorem 6.22 Every fourth-degree equation is solvable by radicals. 


We illustrate Lagrange’s method with an equation whose solutions can, of course, be
found in a much shorter way. Consider the equation x^4 - x^2 = 0. Here, the auxiliary
cubic turns out to be y^3 + y^2 = 0 and we choose A = -1 (usually the choice is arbitrary,
though in this case choosing A = 0 would lead to problems). This gives us the auxiliary
quadratic z^2 + z = 0 and we set α = -1 and β = 0. This gives us γ = δ = 0 and so the
roots of the original quartic are those of the quadratics u^2 - 1 = 0 and v^2 = 0, namely,

±1, 0, 0.
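
The extraction of the four roots from a single root A of the auxiliary cubic can be sketched as follows. This is a floating-point illustration under stated assumptions, not a complete implementation of Lagrange's method: the root A is supplied by the caller (the cubic step is omitted), the names solve_monic_quadratic and lagrange_roots are ours, and the case α = β is simply rejected.

```python
import cmath


def solve_monic_quadratic(b, c):
    """Both roots of t^2 + b t + c = 0 (complex arithmetic)."""
    s = cmath.sqrt(b * b - 4 * c)
    return (-b + s) / 2, (-b - s) / 2


def lagrange_roots(a, b, c, d, A):
    """Roots of x^4 + a x^3 + b x^2 + c x + d, given one root A of the auxiliary cubic.

    The coefficient b enters only through the cubic that produced A.
    """
    alpha, beta = solve_monic_quadratic(-A, d)       # z^2 - A z + d = 0
    if abs(alpha - beta) < 1e-12:
        raise ValueError("alpha equals beta; choose another root A")
    gamma = (c - a * alpha) / (alpha - beta)         # r_1 + r_2
    delta = (a * beta - c) / (alpha - beta)          # r_3 + r_4
    r1, r2 = solve_monic_quadratic(-gamma, alpha)    # u^2 - gamma u + alpha = 0
    r3, r4 = solve_monic_quadratic(-delta, beta)     # v^2 - delta v + beta = 0
    return [r1, r2, r3, r4]


# The example above: x^4 - x^2 = 0 with A = -1.
print(lagrange_roots(0, -1, 0, 0, -1))
# A second check: (x-1)(x-2)(x-3)(x-4) with A = 1*2 + 3*4 = 14.
print(lagrange_roots(-10, 35, -50, 24, 14))
```

Which of the two quadratic roots is labeled α is immaterial here; swapping them merely swaps the roles of the pairs (r_1, r_2) and (r_3, r_4).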
Exercises 6.5 


Use Lagrange’s method to solve the equations in Exercises 6.5.1 to 6.5.4. 


1. x^4 - 1 = 0     2. x^4 + 1 = 0     3. x^4 - x = 0     4. x^4 + x = 0


Explain why the equations in Exercises 6.5.5 to 6.5.7 are resolvable by radicals. 

§- x8 — 2x7 43x6—5x? + 1x4 —5x9 + 3x7—-2e +1=0 

6. x®—~3x°4+5x4—2x?+17=0 

7 x25 +5x°—32=0 

8. Suppose the roots of x^4 + ax^3 + bx^2 + cx + d = 0 are such that the product of
any two equals the product of the other two. Prove that x^4 + ax^3 + bx^2 + cx + d
factors into either (x + r)^4 or (x^2 - r^2)^2 for some complex number r.




Chapter Summary 


The notion of a field was isolated and identified as the underlying structure common 
to the rational, real, complex, and prime modulus number systems. Polynomials were 
studied in this abstract context, and the correspondence between their zeroes and their 
linear factors was pointed out. The Euclidean algorithm was demonstrated to contain 
much useful information about polynomials. Some attention was paid to the relation 
between the coefficients of a polynomial and its zeroes. It was also shown that information
of this type can be used to yield a formula for the solution of the general quartic equation.
Chapter Review Exercises 


Mark the following true or false. 
1. The remainder of x!” +1 when divided by x7 +1 is x? +1. 
2. In R[x], the remainder of x7? + 1 when divided by x + 2 is 4,321. 
3- In Z,[x], x4 —x is divisible by x? —3x +2. 
4. x* +1 is reducible over Z,. 


§- x? +1 is reducible over every field. 


6. The equation 5x4 +(1+i)x? —ix +17 =0 has four complex roots. 
7- The greatest common divisor of x? + 1 and x? —ix over C is x —i. 
8. The product of all the complex roots of the equation x? —3x +5 =0 is —5. 
9. The quartic equation is solvable by radicals. 
New Terms 
constant polynomial, 103 ground field, 102 
degree, 103 irreducible, 108 
division of polynomials, 103 Lagrange’s method, 127 
divisors, 108 monic polynomial, 103 
elementary symmetric polynomials, 121 multiplicity of a zero, 110 
Euclidean greatest common divisor, 115 polynomial over a field, 101 
factorable, 108 reducible, 108 
factorization, 108 relatively prime, 116 
factors, 108 variables, 101
field, 99 zero of a polynomial, 107 


greatest common divisor, 113 zero polynomial, 103 




Supplementary Exercises 


Write a computer script that finds the greatest common divisor of any two polyno- 


mials with real coefficients. 


Write a computer script that finds the greatest common divisor of any two polyno- 


mials with coefficients in Z,. 


Write a computer script that finds the greatest common divisor of any two polyno- 


mials with coefficients in Z,. 


Write a computer script that implements Lagrange’s solution of the general quartic 


with complex coefficients. 


Write a computer script that lists the monic irreducible polynomials of degree d 


over Z, ‘ 


Find a formula for the number of irreducible monic polynomials of degree d over


Z,. 


Prove that every symmetric polynomial in the variables x_1, x_2, ..., x_n is expressible
as a polynomial in the elementary symmetric polynomials

Σ x_1, Σ x_1 x_2, ..., Σ x_1 x_2 ... x_n.


Chapter 7 


GALOIS FIELDS 


SOME NEW FIELDS are introduced and studied in detail. These fields combine some of
the features of both the complex and the modular number systems. The existence
of primitive roots modulo p is proved in this new context.


7.1  Galois's Construction of His Fields 


The following quotation consists of the opening paragraphs of the article On the Theory 
of Numbers by Evariste Galois, which appeared in the June 1830 issue of the Bulletin 
des Sciences mathématiques. Some of the notation has been modernized for pedagogical 
reasons and a more faithful translation appears in Appendix D. 


When it is agreed to consider as zero all the quantities which are the multiples of a given
prime number p, and, subject to this convention, one looks for solutions to the polynomial
equation F(x) = 0, i.e., the equations that Mr. Gauss denotes by F(x) ≡ 0, it is customary to
consider only integer solutions to these sorts of questions. Having been led by some specific
researches to consider their irrational solutions, I have arrived at some results that I consider
to be new.

Let there be given such an equation or congruence, F(x) = 0, and let p be the modulus.
Suppose first that the congruence in question admits no rational factors, that is, there exist
no three polynomials φ(x), ψ(x), χ(x) such that

φ(x)ψ(x) = F(x) + pχ(x).

In that case the congruence has no integer roots, nor any factor of smaller degree. One should
therefore regard the roots of this congruence as some kind of imaginary symbols (since they do
not satisfy the same questions as integers), symbols whose employment, in calculations, will
often prove as useful as that of the imaginary √-1 in ordinary analysis. We are concerned
here with the classification of these imaginaries and the minimization of their number. Let i
denote one of the roots of the congruence F(x) = 0, which can be supposed to have degree ν.




Consider the general expression

a_0 + a_1 i + a_2 i^2 + ... + a_{ν-1} i^{ν-1}, (A)

where a_0, a_1, a_2, ..., a_{ν-1} represent integers. When these numbers are assigned all their possible
values, Expression A runs through p^ν values which possess, as I shall demonstrate, the same
properties as the natural numbers in the theory of residues of powers.


In the first paragraph Galois states that it is his intention to consider the solutions of
polynomial equations with coefficients in Z_p. He then goes on to explain what is meant
by irreducibility of polynomials modulo p, a topic that was covered in Section 6.2.
The closing sentence of the second paragraph is extraordinarily creative and imaginative.
Just as the polynomial x^2 + 1, which is irreducible over the real numbers, yields the
imaginary but useful number √-1, so do these polynomials which are irreducible over
Z_p yield a new species of imaginary symbols. Accordingly, we shall refer to these as Galois
imaginaries.

Galois next proceeds to draw some further consequences from this analogy. It was seen
in Section 2.1 that if √-1 is a zero of the irreducible quadratic x^2 + 1, i.e., if

(√-1)^2 + 1 = 0 or (√-1)^2 = -1, (7.1)

then any rational function of √-1 can be reduced to the form a + b√-1. Galois now
asserts that if i is the new imaginary number associated with an irreducible polynomial
F(x) of degree ν over Z_p, then every rational function of i can be reduced to the form

a_0 + a_1 i + a_2 i^2 + ... + a_{ν-1} i^{ν-1}. (7.2)

Let us digress here into some concrete computations. Consider the irreducible polynomial
x^2 + x + 1 over Z_2, and suppose it has α as a Galois imaginary, so that, in analogy
with Equation 7.1, α satisfies the equation α^2 + α + 1 = 0 over Z_2, or α^2 = 1 + α.
Consequently, any second-degree polynomial in α can be reduced to a linear function of α.
For example,

α^2 + 1 = 1 + α + 1 = 2 + α = 0 + α = α.

The same holds for cubic polynomials, since

α^3 = α^2 α = (1 + α)α = α + α^2 = α + 1 + α = 1 + 2α = 1 + 0α = 1.

Similarly, α^4 = α^3 α = 1α = α, α^5 = α^4 α = αα = α^2 = 1 + α, and so on. In other words,
the successive powers of α cycle through the values 1, α, and 1 + α, and hence every
polynomial function of α can be reduced to the form

a_0 + a_1 α with a_0, a_1 ∈ Z_2. (7.3)

The preceding considerations make it clear that the sum and product of any two of the
expressions 0, 1, α, α + 1 is again an expression of the same form. Let us now examine the
issue of division. Does each of the nonzero elements of the form in Expression 7.3 have
an inverse? Since the coefficients a_0 and a_1 can assume only the values 0 or 1, there are
only three nonzero elements to consider: 1 = α^0, α = α^1, and 1 + α = α^2. As it is already
known that α^3 = 1, it follows that 1^{-1} = 1 (of course), α^{-1} = α^2, and (α^2)^{-1} = α^1 = α.

Thus the elements 0, 1, α, and 1 + α form a field provided that 0 and 1 are understood
to be elements of Z_2, and provided it is assumed that α^2 + α + 1 = 0. With the sole
exception of the issue of the existence of multiplicative inverses, the validity of the field
properties in this new context follows from their validity for polynomials, since 1 + α is
treated much the same as the polynomial 1 + x, etc.

Let us consider another example in detail before we comment on Galois’s brainchild in
general. The polynomial x^3 + x^2 + 1 is irreducible of degree 3 over Z_2. Let β be a Galois
imaginary associated with this polynomial. I.e., β is a number such that β^3 + β^2 + 1 = 0,
or β^3 = 1 + β^2, over Z_2. Then

β^4 = β^3 β = (1 + β^2)β = β + β^3 = β + 1 + β^2 = 1 + β + β^2,
β^5 = β^4 β = (1 + β + β^2)β = β + β^2 + β^3 = β + β^2 + 1 + β^2 = 1 + β + 2β^2 = 1 + β,
β^6 = β^5 β = (1 + β)β = β + β^2,
β^7 = β^6 β = (β + β^2)β = β^2 + β^3 = β^2 + 1 + β^2 = 1 + 2β^2 = 1.

It is again clear that every polynomial function of β is reducible to the form

a_0 + a_1 β + a_2 β^2,

where each of the coefficients a_0, a_1, and a_2 can assume the values 0 or 1. There are
exactly 2^3 = 8 such values, and, as seen above, these eight values can also be listed as
0, 1, β, β^2, β^3, β^4, β^5, β^6. Since the fact that β^7 = 1 implies that

(β^k)^{-1} = β^{7-k} for all k = 0, 1, 2, 3, 4, 5, 6,

it follows that this set of eight Galois imaginaries constitutes a field.
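
Reductions of this kind are mechanical, and a short Python sketch may make the rule explicit. The conventions are our own: an element a_0 + a_1 β + ... is stored as the list [a_0, a_1, ...], the monic irreducible polynomial P of degree at least 2 is stored the same way, and the names gf_mul and cyclic_powers are hypothetical. The reduction step simply replaces the top power using the defining relation, exactly as β^3 was replaced by 1 + β^2 above.

```python
def gf_mul(u, v, P, p):
    """Product of two elements of GF(p, P); P is monic, lowest degree first."""
    nu = len(P) - 1
    prod = [0] * (2 * nu - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] = (prod[i + j] + a * b) % p
    # Replace x^k (k >= nu) using x^nu = -(P[0] + P[1] x + ... + P[nu-1] x^(nu-1)).
    for k in range(len(prod) - 1, nu - 1, -1):
        c, prod[k] = prod[k], 0
        for i in range(nu):
            prod[k - nu + i] = (prod[k - nu + i] - c * P[i]) % p
    return prod[:nu]


def cyclic_powers(P, p):
    """Successive powers of the Galois imaginary, stopping at the first power equal to 1.

    Assumes P is irreducible of degree >= 2, so that the loop terminates.
    """
    nu = len(P) - 1
    x = [0, 1] + [0] * (nu - 2)
    one = [1] + [0] * (nu - 1)
    powers, current = [], x
    while True:
        powers.append(current)
        if current == one:
            return powers
        current = gf_mul(current, x, P, p)


# x^3 + x^2 + 1 over Z_2: the powers of beta, from beta^1 up to beta^7 = 1.
for k, element in enumerate(cyclic_powers([1, 0, 1, 1], 2), start=1):
    print(k, element)
```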

It turns out that the set of elements of the form of Expression 7.2 generated by a Galois
imaginary is always a field, a fact that will be proved shortly. That such a set is closed
with respect to addition, subtraction, and multiplication is quite clear. The existence
of multiplicative inverses, however, is another, less obvious, matter. The polynomial
x^4 + x^3 + x^2 + x + 1 is irreducible over Z_2 and so it has a Galois imaginary γ associated
with it, such that γ^4 = 1 + γ + γ^2 + γ^3 over Z_2. The form of Expression 7.2 gives rise to
2^4 = 16 associated elements. However, if we now proceed to list the powers of γ, as was
done earlier for α and β, we encounter a difficulty, for

γ^5 = γ^4 γ = (1 + γ + γ^2 + γ^3)γ = γ + γ^2 + γ^3 + γ^4 = γ + γ^2 + γ^3 + 1 + γ + γ^2 + γ^3 = 1.

For the first time the successive powers of the Galois imaginary have failed to cycle
through all the elements of the form of Expression 7.2. Consequently, the device used
in the previous examples to identify the inverse of each nonzero element is not available
here. Nevertheless, the Galois imaginary γ does generate a corresponding field. To
demonstrate this it is only necessary to show that when the symbol i of Expression 7.2
is replaced by γ, or by any Galois imaginary, then each nonzero element of the form of
Expression 7.2 does indeed have a multiplicative inverse of the same form. The same
Euclidean algorithm that was used to prove the existence of inverses in Z_p works in this
new context as well.


Lemma 7.4 Let P(x) be an irreducible polynomial of degree ν over Z_p, and let δ be
the associated Galois imaginary. For each element

ζ = a_0 + a_1 δ + a_2 δ^2 + ... + a_{ν-1} δ^{ν-1}, a_i ∈ Z_p, i = 0, 1, 2, ..., ν - 1,

if the coefficients a_0, a_1, ..., a_{ν-1} are not all zero, then there exists an element η of the
same form such that ηζ = 1.


Proof. With ζ as above, define

Q(x) = a_0 + a_1 x + a_2 x^2 + ... + a_{ν-1} x^{ν-1}.

Since P(x) is an irreducible polynomial of degree ν and Q(x) is a nonzero polynomial
of degree less than ν, it follows that P(x) and Q(x) have 1 as a greatest common divisor.
Hence, by Proposition 6.15 there exist polynomials A(x) and B(x) such that

A(x)Q(x) + B(x)P(x) = 1 over Z_p.

Consequently, A(δ)Q(δ) + B(δ)P(δ) = 1 over Z_p. However, by the definition of δ,
P(δ) = 0, and by the definition of Q(x), Q(δ) = ζ. Hence, A(δ)ζ = 1 over Z_p, and so
we can choose η = A(δ). ∎
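
The proof is constructive: running the Euclidean algorithm on P(x) and Q(x) while tracking only the cofactor of Q(x) produces the inverse. The Python sketch below follows that idea under our own conventions (coefficient lists with the constant term first, prime p, and the hypothetical name gf_inverse); it is an illustration of the lemma rather than a tuned implementation.

```python
def trim(a):
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_sub(a, b, p):
    size = max(len(a), len(b))
    a, b = a + [0] * (size - len(a)), b + [0] * (size - len(b))
    return trim((x - y) % p for x, y in zip(a, b))


def poly_mul(a, b, p):
    out = [0] * max(len(a) + len(b) - 1, 0)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)


def poly_divmod(a, b, p):
    a = trim(a)
    inv_lead = pow(b[-1], -1, p)
    q = [0] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        shift, coef = len(a) - len(b), a[-1] * inv_lead % p
        q[shift] = coef
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - coef * c) % p
        a = trim(a)
    return trim(q), a


def gf_inverse(elem, P, p):
    """Inverse of a nonzero element of GF(p, P), padded to length deg P."""
    r0, r1 = trim(c % p for c in P), trim(c % p for c in elem)
    if not r1:
        raise ZeroDivisionError("0 has no inverse")
    s0, s1 = [], [1]                  # invariant: r_i = s_i * elem modulo P
    while r1:
        q, r = poly_divmod(r0, r1, p)
        r0, r1 = r1, r
        s0, s1 = s1, poly_sub(s0, poly_mul(q, s1, p), p)
    if len(r0) != 1:
        raise ValueError("P and elem share a factor, so P is not irreducible")
    scale = pow(r0[0], -1, p)         # make the gcd equal to 1
    inv = [c * scale % p for c in s0]
    return inv + [0] * (len(P) - 1 - len(inv))


# Inverse of 1 + gamma in GF(2, x^4 + x^3 + x^2 + x + 1).
print(gf_inverse([1, 1, 0, 0], [1, 1, 1, 1, 1], 2))   # [0, 1, 0, 1], i.e. gamma + gamma^3
```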


If P(x) is any irreducible polynomial over Z_p, and i is a corresponding Galois imaginary,
then the set of all the elements of the form of Expression 7.2 is denoted by GF(p, P(x))
and is called a Galois field. Thus, the three Galois fields described above are
GF(2, x^2 + x + 1), GF(2, x^3 + x^2 + 1), and GF(2, x^4 + x^3 + x^2 + x + 1). It follows from
Lemma 7.4 that every Galois field is indeed a field in the sense of Section 6.1, and we state
this explicitly.

Theorem 7.5 If P(x) is an irreducible polynomial over Z_p, then the Galois field
GF(p, P(x)) is a field.


Just as the polynomial x^2 + 1, which is irreducible over R, has the two imaginaries i
and -i associated with it, so is it the case that every degree ν polynomial P(x) that is
irreducible over Z_p has ν distinct Galois imaginaries. However, it so happens that the
Galois field generated by any one of these imaginaries of P(x) contains all the other
ν - 1 imaginaries of P(x), and so all of the Galois imaginaries of P(x) define the same
Galois field GF(p, P(x)). A more detailed discussion of this phenomenon will be found
in Section 7.4 below.

Galois does not prove Theorem 7.5 explicitly. Later in his paper he writes 

Next it can be proven, just as is done in the theory of numbers, that there exist primitive roots
α ... which ... reproduce, by their powers, the complete sequence of all the other roots.

In other words, Galois claims that the Galois field GF(p, P(x)) contains an element
α whose powers 1, α, α^2, α^3, ... run through all the nonzero values of Expression 7.2
above. Because of the similarity that this bears to the powers of the complex roots of
unity, such an element is also called primitive. The examples preceding Lemma 7.4 show
how such a primitive element can be used to demonstrate the existence of multiplicative
inverses. A table that expresses all the powers of a primitive Galois imaginary in the form
of Expression 7.2 is called a cyclic table. Table 7.1 contains the cyclic table of the primitive
Galois imaginary β associated with GF(2, x^3 + x^2 + 1). The detailed calculations were
displayed above.

β^0 = 1              β^4 = 1 + β + β^2
β^1 = β              β^5 = 1 + β
β^2 = β^2            β^6 = β + β^2
β^3 = 1 + β^2        β^7 = 1

Table 7.1 The cyclic table of GF(2, x^3 + x^2 + 1)

The elements $\alpha$ and $\beta$ of this chapter’s first two examples are primitive elements of
GF(2, $x^2 + x + 1$) and GF(2, $x^3 + x^2 + 1$) respectively. On the other hand, the element
$\gamma$ fails to be a primitive element of GF(2, $x^4 + x^3 + x^2 + x + 1$). The order $o(\zeta)$ of any
nonzero element $\zeta$ of some Galois field is the least positive integer $k$ such that $\zeta^k = 1$. Thus, in
the aforementioned Galois field $\gamma$ has order 5. On the other hand, in GF(2, $x^2 + x + 1$),
the elements $\alpha$ and $\alpha + 1$ both have order 3. Similarly, with the exception of 0 and 1,
each of the elements of GF(2, $x^3 + x^2 + 1$) has order 7. As it would be pedagogically
useful to have at least one Galois field that is not associated with a polynomial over $\mathbb{Z}_2$, we
compute the cyclic table of GF(3, $x^2 + x + 2$). That this polynomial is indeed irreducible
over $\mathbb{Z}_3$ follows from the facts that

    $0^2 + 0 + 2 = 2 \neq 0$,
    $1^2 + 1 + 2 = 4 \neq 0$
and
    $2^2 + 2 + 2 = 8 \neq 0 \pmod 3$.

If $\sigma$ is the associated Galois imaginary, then

    $\sigma^2 + \sigma + 2 = 0$

over $\mathbb{Z}_3$, and so

    $\sigma^2 = -2 - \sigma = 1 + 2\sigma$.

Consequently

    $\sigma^3 = \sigma^2\sigma = (1 + 2\sigma)\sigma = \sigma + 2\sigma^2 = \sigma + 2(1 + 2\sigma) = 2 + 5\sigma = 2 + 2\sigma$,
    $\sigma^4 = \sigma^3\sigma = (2 + 2\sigma)\sigma = 2\sigma + 2\sigma^2 = 2\sigma + 2(1 + 2\sigma) = 2 + 6\sigma = 2$,
    $\sigma^5 = \sigma^4\sigma = 2\sigma$,
    $\sigma^6 = \sigma^5\sigma = 2\sigma^2 = 2 + 4\sigma = 2 + \sigma$,
    $\sigma^7 = \sigma^6\sigma = (2 + \sigma)\sigma = 2\sigma + \sigma^2 = 2\sigma + 1 + 2\sigma = 1 + \sigma$,
    $\sigma^8 = \sigma^7\sigma = (1 + \sigma)\sigma = \sigma + \sigma^2 = \sigma + 1 + 2\sigma = 1$.

It follows that $\sigma$ is indeed a primitive element of GF(3, $x^2 + x + 2$).
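The hand computation of a cyclic table can be mirrored by a short program. The following is a minimal sketch in Python, not a method taken from the text: it assumes that a field element is stored as a tuple of coefficients (lowest degree first), that the polynomial is monic of degree at least 2, and the helper names `gf_mul` and `cyclic_table` are illustrative only.

```python
def gf_mul(a, b, modulus, p):
    """Multiply two elements modulo a monic polynomial over Z_p.

    Elements and the modulus are coefficient sequences, lowest degree first,
    so (1, 2) stands for 1 + 2*sigma and [2, 1, 1] stands for x^2 + x + 2.
    """
    n = len(modulus) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Eliminate x^k for k >= n, highest first, using
    # x^n = -(m_0 + m_1 x + ... + m_{n-1} x^{n-1}).
    for k in range(2 * n - 2, n - 1, -1):
        c, prod[k] = prod[k], 0
        for j in range(n):
            prod[k - n + j] = (prod[k - n + j] - c * modulus[j]) % p
    return tuple(prod[:n])


def cyclic_table(modulus, p):
    """Return [g, g^2, ...] up to the first power equal to 1 (g = Galois imaginary).

    Assumes the modulus is irreducible of degree >= 2, so that g is invertible.
    """
    n = len(modulus) - 1
    one = (1,) + (0,) * (n - 1)
    g = (0, 1) + (0,) * (n - 2)
    powers, current = [], one
    while True:
        current = gf_mul(current, g, modulus, p)
        powers.append(current)
        if current == one:
            return powers


table = cyclic_table([2, 1, 1], 3)  # x^2 + x + 2 over Z_3
for k, (c0, c1) in enumerate(table, start=1):
    print(f"sigma^{k} = {c0} + {c1}*sigma")
```

The Galois imaginary is primitive exactly when the returned list has $p^\nu - 1$ entries; for $x^2 + x + 2$ over $\mathbb{Z}_3$ the list has 8 entries, in agreement with the calculation above.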

The following proposition provides us with some general information about the number 
of elements of a Galois field and their orders. 
Proposition 7.6 Let F be a Galois field associated with the irreducible polynomial
P(x) of degree $\nu$ over $\mathbb{Z}_p$. Then F has exactly $p^\nu$ elements and the order of each nonzero
element of F is finite.

Proof. Let i be the Galois imaginary associated with P(x). As was remarked above, every
element of F is expressible in the form of Expression 7.2. Hence F contains at most $p^\nu$
elements. In order to show that F contains exactly that number of elements, it suffices
to show that no two distinct expressions of the form of Expression 7.2 can equal each
other. Suppose $\zeta_1$ and $\zeta_2$ are elements of F given by distinct expressions

    $\zeta_1 = a_0 + a_1 i + a_2 i^2 + \cdots + a_{\nu-1} i^{\nu-1}$
and
    $\zeta_2 = b_0 + b_1 i + b_2 i^2 + \cdots + b_{\nu-1} i^{\nu-1}$

where $a_k \neq b_k$ for some $k = 0, 1, \ldots, \nu - 1$. Then, when Lemma 7.4 is applied to the
element

    $\zeta = \zeta_1 - \zeta_2 = (a_0 - b_0) + (a_1 - b_1) i + (a_2 - b_2) i^2 + \cdots + (a_{\nu-1} - b_{\nu-1}) i^{\nu-1}$,

we conclude that there is an element $\eta$ such that $\zeta\eta = 1$. By Proposition 6.1, $\zeta$ is not
zero and so $\zeta_1 \neq \zeta_2$. Thus F has exactly $p^\nu$ elements.

Let $\alpha$ be any nonzero element of F. Since all the terms of the infinite sequence
$1, \alpha, \alpha^2, \alpha^3, \ldots$ are elements of F, and since F contains only a finite number of elements,
it follows that there exist two distinct exponents $m > n$ such that $\alpha^m = \alpha^n$. But then
$\alpha^{m-n} = 1$, and so $o(\alpha) \leq m - n$. Thus the order of each nonzero element of F is finite. ∎
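Both claims of Proposition 7.6 can be checked by brute force in a small case. The sketch below is an illustration under assumptions of our own (elements as coefficient tuples, lowest degree first; the names `gf_mul` and `order` are hypothetical), applied to GF(2, $x^4 + x^3 + x^2 + x + 1$).

```python
from itertools import product


def gf_mul(a, b, modulus, p):
    """Product of two coefficient tuples modulo a monic polynomial over Z_p."""
    n = len(modulus) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for k in range(2 * n - 2, n - 1, -1):  # reduce x^k for k >= n
        c, prod[k] = prod[k], 0
        for j in range(n):
            prod[k - n + j] = (prod[k - n + j] - c * modulus[j]) % p
    return tuple(prod[:n])


def order(a, modulus, p):
    """Least k >= 1 with a^k = 1; terminates because a is a nonzero field element."""
    n = len(modulus) - 1
    one = (1,) + (0,) * (n - 1)
    k, current = 1, a
    while current != one:
        current = gf_mul(current, a, modulus, p)
        k += 1
    return k


modulus, p = [1, 1, 1, 1, 1], 2           # x^4 + x^3 + x^2 + x + 1 over Z_2
nu = len(modulus) - 1
elements = list(product(range(p), repeat=nu))   # all expressions of the form 7.2
nonzero = [a for a in elements if any(a)]
orders = {a: order(a, modulus, p) for a in nonzero}

print(len(elements))                      # number of elements, p**nu
print(orders[(0, 1, 0, 0)])               # order of gamma
print(orders[(1, 1, 0, 0)])               # order of 1 + gamma
```

Distinct coefficient tuples are distinct elements, so the field has $2^4 = 16$ elements, and every nonzero element reaches 1 after finitely many multiplications.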


Exercises 7.1 


1. Write out the cyclic table of GF(2, $x^3 + x + 1$).
2. Write out the cyclic table of GF(2, $x^4 + x + 1$).

Verify that the Galois imaginary associated with each of the polynomials in Exercises 7.1.3
and 7.1.4 is primitive over $\mathbb{Z}_2$ and write out the associated cyclic table.


3. x? +x741 4 xP 4x34] 


Verify that the Galois imaginary associated with each of the polynomials in Exercises 7.1.5
to 7.1.7 is primitive over $\mathbb{Z}_3$ and write out the associated cyclic table.


5. x? +2x4+2 6. x +2x41 7 xP 42x27 41 


8. Verify that the Galois imaginary associated with $x^2 + 4x + 2$ is primitive over $\mathbb{Z}_5$
and construct its cyclic table.
9. Verify that the Galois imaginary associated with $x^2 + 6x + 3$ is primitive over $\mathbb{Z}_7$
and construct its cyclic table.

Let $\beta$ be the Galois imaginary associated with the irreducible polynomial $x^3 + x^2 + 1$
over $\mathbb{Z}_2$. Solve the (systems of simultaneous) equations in Exercises 7.1.10 to 7.1.13 in
GF(2, $x^3 + x^2 + 1$).
10. (1+8)x+6=1+ 8° Il. x+y=Phandx+fy=1 
12, x+(1+f)y=B+ 6" and (1+6’?)x+y =0 
13. x+By+f'°z=146;(1+P)x+(14+f’)y +z =1;and By +z=8 
14. Solve Exercise 7.1.10 with GF(2, $x^3 + x^2 + 1$) replaced by GF(3, $x^2 + 2x + 2$).
15. Solve Exercise 7.1.11 with GF(2, $x^3 + x^2 + 1$) replaced by GF(3, $x^2 + 2x + 2$).
16. Solve Exercise 7.1.12 with GF(2, $x^3 + x^2 + 1$) replaced by GF(3, $x^2 + 2x + 2$).
17. Solve Exercise 7.1.13 with GF(2, $x^3 + x^2 + 1$) replaced by GF(3, $x^2 + 2x + 2$).
18. Explain why GF(p, x - 1) = $\mathbb{Z}_p$.
19. Explain why the Binomial Theorem holds for elements of Galois fields.
20. Prove that $(a + b)^p = a^p + b^p$ for any $a, b \in$ GF(p, P(x)).

21. Prove that GF(p, P(x)) contains exactly one p-th root of unity.

22. Prove that for any $a, b \in$ GF(p, P(x)), $a^p = b^p$ if and only if $a = b$.

23. Prove that for any positive integer $k$ and for any $a, b \in$ GF(p, P(x)), $a^{p^k} = b^{p^k}$ if
and only if $a = b$.

24. True or false: for any a, 6 € GF(p, P(x)), a? = 6? if and only if a = 6. Justify your 
answer. 

25. Show that for any $a \in$ GF(p, P(x)), $a^p = a$ if and only if $a \in \mathbb{Z}_p$.

26. Let $k$ be any fixed positive integer. Show that $\{a \in \mathrm{GF}(p, P(x)) \mid a^{p^k} = a\}$ is also
a field.

27. Prove that if p # 2, then the sum of all the elements of GF(p, P(x)) is 0. 

28. Prove that the product of all the nonzero elements of GF(p, P(x)) is —1. 


29. Let a be an element of the Galois field GF(p, P(x)). Prove that 


(a) ltata?+---+a! =0. (b) 1-a-a?--- al! = (-1)1, 


7.2 The Galois Polynomial 


The orders of the elements of Galois fields, defined in the previous section, possess the
same properties as the orders of the complex and modular roots of unity, which are restated
here for the sake of completeness. Since the proofs of Proposition 2.16, Corollary 2.17,
and Propositions 5.19 and 5.20 work in the new context verbatim, these properties are
restated without proof.

Proposition 7.7 Let $\alpha$ and $\beta$ be any roots of unity in some field F. Then
(a) $\alpha^n = 1$ if and only if $n$ is a multiple of $o(\alpha)$;
(b) $\alpha^a = \alpha^b$ if and only if $o(\alpha)$ is a divisor of $a - b$, and so $1, \alpha, \alpha^2, \ldots, \alpha^{o(\alpha)-1}$ are all
distinct;
(c) if $o(\alpha) = n$, then $o(\alpha^k) = n/(k, n)$;
(d) $o(\alpha\beta) = o(\alpha)\,o(\beta)$ if $o(\alpha)$ and $o(\beta)$ are relatively prime.


If $\alpha$ is any element of order $k$, it must clearly be a zero of the polynomial $x^n - 1$
whenever $n$ is a multiple of $k$. Hence, by the first part of the above proposition, if $e$ is
the least common multiple of the orders of all the nonzero elements of the Galois field
GF(p, P(x)), then these elements are all zeroes of $x^e - 1$. This number $e$ is, of course, of
interest, and it will eventually be demonstrated (Theorem 7.17) that $e = p^\nu - 1$, where $\nu$
is the degree of P(x). We begin this process by picking up where the previous section’s
quotation from Galois’s paper left off.


Of the expressions [in Expression A] we shall only take the $p^\nu - 1$ values obtained when
$a_0, a_1, a_2, \ldots, a_{\nu-1}$ are not all zero; let $\alpha$ be one of these expressions.

If $\alpha$ is successively raised to the second, third, ... powers, a sequence of quantities all of which
have the same form is obtained (since every function of $i$ is reducible to the $(\nu - 1)$-th degree).
Hence it must be that $\alpha^n = 1$ for some $n$; let $n$ be the smallest number such that $\alpha^n = 1$.
Then the numbers $1, \alpha, \alpha^2, \alpha^3, \ldots, \alpha^{n-1}$ are all distinct. Next, multiply these $n$ numbers by
another expression $\beta$ of the same form. We then obtain another new group of quantities all
different from the first group as well as from each other. If the quantities of Form 7.2 have
not been exhausted yet, the powers of $\alpha$ can be multiplied by a new expression $\gamma$, and so on.
Consequently the number $n$ necessarily divides the total number of quantities of Form 7.2.
Since this number is $p^\nu - 1$, we see that $n$ divides $p^\nu - 1$. From this it also follows that
$\alpha^{p^\nu - 1} = 1$, or $\alpha^{p^\nu} = \alpha$.

Two sentences later we find the following statement:

We note here the remarkable result that all the algebraic quantities that arise in this theory
are roots of equations of the form

    $x^{p^\nu} = x$.


To illustrate the above procedure, consider the Galois imaginary $\gamma$ associated with the
polynomial $x^4 + x^3 + x^2 + x + 1$ which is irreducible over $\mathbb{Z}_2$. We saw that $\gamma^5 = 1$. Since
$\gamma^4 = 1 + \gamma + \gamma^2 + \gamma^3$, the process begins with

    $1,\ \gamma,\ \gamma^2,\ \gamma^3,\ 1 + \gamma + \gamma^2 + \gamma^3$.    (7.8)

According to Proposition 7.6, any two elements of Expression A are distinct, and hence
$1 + \gamma$ is different from all the elements listed in Equation 7.8. Using this $1 + \gamma$ as $\beta$, the
next set of elements produced by Galois’s procedure is

    $(1 + \gamma)1,\ (1 + \gamma)\gamma,\ (1 + \gamma)\gamma^2,\ (1 + \gamma)\gamma^3,\ (1 + \gamma)(1 + \gamma + \gamma^2 + \gamma^3)$

or, upon simplification,

    $1 + \gamma,\ \gamma + \gamma^2,\ \gamma^2 + \gamma^3,\ 1 + \gamma + \gamma^2,\ \gamma + \gamma^2 + \gamma^3$.    (7.9)

The element $1 + \gamma^2$ has not been listed yet, so the next list is

    $(1 + \gamma^2)1,\ (1 + \gamma^2)\gamma,\ (1 + \gamma^2)\gamma^2,\ (1 + \gamma^2)\gamma^3,\ (1 + \gamma^2)(1 + \gamma + \gamma^2 + \gamma^3)$

or, upon simplification,

    $1 + \gamma^2,\ \gamma + \gamma^3,\ 1 + \gamma + \gamma^3,\ 1 + \gamma^3,\ 1 + \gamma^2 + \gamma^3$.    (7.10)

It is easily verified by inspection that, as Galois claims, the three sets listed in Equations 7.8,
7.9, and 7.10 exhaust all the nonzero elements of GF(2, $x^4 + x^3 + x^2 + x + 1$). We now
state Galois’s theorem and supply some of the missing details in his proof. Another,
much more general and succinct proof will be provided later by Proposition 9.16 and
Exercise 9.5.12.


Theorem 7.11 (Galois) Let P(x) be an irreducible polynomial of degree $\nu$ over $\mathbb{Z}_p$.
Then all the elements of GF(p, P(x)) are zeroes of the polynomial $x^{p^\nu} - x$.

Proof. Let $\alpha$ be any nonzero element of F = GF(p, P(x)) and suppose that it has order
$n$. By Proposition 7.7, the list

    $1, \alpha, \alpha^2, \ldots, \alpha^{n-1}$    (7.12)

consists of $n$ distinct elements. If this list does not exhaust all the nonzero elements of
F, let $\beta$ be a nonzero element that does not appear in this list, and consider the new list

    $\beta, \alpha\beta, \alpha^2\beta, \ldots, \alpha^{n-1}\beta$.    (7.13)

All the elements of List 7.13 are distinct from each other since otherwise some two
elements of List 7.12 would also be nondistinct. Moreover, List 7.12 and List 7.13 are
also disjoint since otherwise we would have, for some integers $k$ and $m$, $\alpha^k\beta = \alpha^m$,
implying that $\beta$ is a power of $\alpha$, which we know not to be the case. If List 7.12 and
List 7.13 do not exhaust the field, we choose an element $\gamma$ that is in neither list and
repeat the process. Since the field F is known to be finite, this process must eventually
terminate and leave the nonzero elements of the field partitioned into disjoint lists each
of which contains exactly $n$ elements. It follows that $n$ is a divisor of the total number of
nonzero elements of F, which, by Proposition 7.6, is $p^\nu - 1$. Hence, by Proposition 7.7,

    $\alpha^{p^\nu - 1} = 1$.    (7.14)

Thus $\alpha$, an arbitrary nonzero element of F, is a zero of the polynomial $x^{p^\nu - 1} - 1$.
Consequently, each of the elements of F, 0 included, is a zero of the polynomial

    $x(x^{p^\nu - 1} - 1) = x^{p^\nu} - x$. ∎

As Galois notes, this is a remarkable fact, for it provides us with a single, very simple
polynomial, $x^{p^\nu} - x$, that has all the elements of GF(p, P(x)) as its zeroes. We shall
refer to this polynomial as the Galois polynomial of GF(p, P(x)). The observation that
the order of each nonzero element of F divides $p^\nu - 1$ implies that the least common
multiple of all the orders, denoted by $e$ in this section’s opening paragraph, divides
$p^\nu - 1$.
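Theorem 7.11 lends itself to a direct numerical check in small fields. The sketch below is only an illustration: the coefficient-tuple representation and the names `gf_mul`, `gf_pow`, and `all_satisfy_galois_polynomial` are assumptions of this example, not notation from the text. It raises every element to the power $p^\nu$ by repeated squaring and compares the result with the element itself.

```python
from itertools import product


def gf_mul(a, b, modulus, p):
    """Product of two coefficient tuples modulo a monic polynomial over Z_p."""
    n = len(modulus) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for k in range(2 * n - 2, n - 1, -1):  # reduce x^k for k >= n
        c, prod[k] = prod[k], 0
        for j in range(n):
            prod[k - n + j] = (prod[k - n + j] - c * modulus[j]) % p
    return tuple(prod[:n])


def gf_pow(a, e, modulus, p):
    """a**e by repeated squaring (e >= 0)."""
    n = len(modulus) - 1
    result, base = (1,) + (0,) * (n - 1), a
    while e > 0:
        if e & 1:
            result = gf_mul(result, base, modulus, p)
        base = gf_mul(base, base, modulus, p)
        e >>= 1
    return result


def all_satisfy_galois_polynomial(modulus, p):
    """True when every coefficient tuple a satisfies a**(p**nu) == a."""
    nu = len(modulus) - 1
    return all(gf_pow(a, p ** nu, modulus, p) == a
               for a in product(range(p), repeat=nu))


print(all_satisfy_galois_polynomial([2, 1, 1], 3))        # GF(3, x^2 + x + 2)
print(all_satisfy_galois_polynomial([1, 1, 1, 1, 1], 2))  # GF(2, x^4 + x^3 + x^2 + x + 1)
```

Such a check confirms the theorem only for the particular polynomial supplied; it is no substitute for the counting argument in the proof.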

It should be of interest to examine the case $\nu = 1$. The irreducible polynomials over
$\mathbb{Z}_p$ of degree $\nu = 1$ are of course the binomials $x - a$ where $a \in \mathbb{Z}_p$. Now the Galois
imaginary associated with $x - a$ is none other than the known quantity $a$, so that

    GF(p, x - a) = $\mathbb{Z}_p$.

Here the Galois polynomial is simply $x^p - x$ and Theorem 7.11 reduces to Fermat’s
Theorem 5.15.

The correspondence between the linear factors of a polynomial and its zeroes yields
the following result.

Corollary 7.15 If P(x) is an irreducible polynomial of degree $\nu$ over $\mathbb{Z}_p$, then

    $x^{p^\nu} - x = \prod_{i=1}^{p^\nu} (x - a_i)$

where $a_1, a_2, \ldots, a_{p^\nu}$ is any listing of the elements of GF(p, P(x)).

Proof. By Theorem 7.11 the polynomial $x^{p^\nu} - x$ has the $p^\nu$ distinct elements of the
field GF(p, P(x)) as its zeroes. By Proposition 6.8, the polynomial $x^{p^\nu} - x$ cannot have
any more zeroes. Thus, $a_1, a_2, \ldots, a_{p^\nu}$ constitute all the zeroes of the Galois polynomial
$x^{p^\nu} - x$. The statement of the corollary now follows from Corollary 6.7. ∎


If $\alpha$ is the Galois imaginary of the polynomial $x^2 + x + 1$ over $\mathbb{Z}_2$, then the elements of
GF(2, $x^2 + x + 1$) are $0, 1, \alpha, 1 + \alpha$ and

    $(x - 0)(x - 1)(x - \alpha)(x - (1 + \alpha)) = [x(x + 1)]\,[(x + \alpha)(x + 1 + \alpha)]$
        $= (x^2 + x)(x^2 + x + \alpha^2 + \alpha) = (x^2 + x)(x^2 + x + 1)$
        $= x^4 + x^3 + x^2 + x^3 + x^2 + x = x^4 + x = x^{2^2} - x$.


Exercises 7.2 


1. Find the orders of all the nonzero elements of GF(2, x? + x? + 1). 
2. Find the orders of all the nonzero elements of GF(2, x4 + x +1). 
3- Find the orders of all the nonzero elements of GF(2, x° + x? +1), 
4. Find the orders of all the nonzero elements of GF(3, $x^2 + 2x + 2$).
5. Find the orders of all the nonzero elements of GF(5, $x^2 + 4x + 2$).
6. Find the orders of all the nonzero elements of GF(7, $x^2 + 6x + 3$).

7. Verify Corollary 7.15 directly for GF(2, x? + x? + 1). 
8. Verify Corollary 7.15 directly for GF(3, $x^2 + 2x + 2$).
9. Show that if GF(p, P(x)) is a Galois field and $a$ is any of its elements, then

    $a + a^p + a^{p^2} + \cdots + a^{p^{\nu-1}} \in \mathbb{Z}_p$,

where $\nu$ is the degree of P(x).


Let F be the Galois field GF(p, P(x)) for Exercises 7.2.10 to 7.2.13. 

10. Show that the sum of all the elements of F is either 0 or 1. 

11. Use the Galois polynomial to prove that the product of all the nonzero elements of 
F is —1. 

12. What is the sum of the squares of the elements of F? 

13. Evaluate the sum of the reciprocals of all the nonzero elements of F. 

14. Let $p$ be a prime number and let $n$ be relatively prime to $p^\nu - 1$. Prove that there
is exactly one $n$-th root of unity in GF(p, P(x)).

15. Suppose $a, b \in$ GF(2, P(x)) for some degree 9 polynomial P(x) that is irreducible
over $\mathbb{Z}_2$. Suppose further that $a^2 + ab + b^2 = 0$. Prove that $a = b$.




7.3. The Primitive Element Theorem 


Toward the end of his paper Galois lets on that his purpose in constructing these new 
number systems was to find new contexts within which primitive roots exist and to 
which Gauss’s techniques, which proved so effective for the algebraic resolution of the 
cyclotomic equation (see Section 2.4), could be applied to produce new algebraically 
resolvable equations. Galois does not prove the existence of these primitive elements, 
contenting himself with a comment to the effect that Gauss’s proof of the existence of 
primitive roots modulo p carries over intact to this new setting. We will not follow 
Gauss’s proof here and give instead a more modern, and somewhat shorter, proof. 

Lemma 7.16 If F is a Galois field with $f$ elements, and if $q^m$ is the largest power of
the prime number $q$ that divides $f - 1$, then F contains an element $a$ of order $q^m$.

Proof. The polynomial $x^{(f-1)/q} - 1$ has degree $(f - 1)/q < f - 1$, and so it follows
from Proposition 6.8 that there is a nonzero element $b \in F$ which is not a zero of this
polynomial, i.e., $b^{(f-1)/q} \neq 1$. Set $a = b^{(f-1)/q^m}$. Then,

    $a^{q^{m-1}} = b^{(f-1)/q} \neq 1$

while, by Proposition 7.6 and Theorem 7.11,

    $a^{q^m} = b^{f-1} = 1$.

Thus, $o(a)$ divides $q^m$ but not $q^{m-1}$, whence $o(a) = q^m$. ∎

We are ready for this chapter’s main theorem:

Theorem 7.17 (The Primitive Element Theorem, Galois) Every Galois field has a
primitive element.

Proof. Let F be a Galois field and suppose it contains $f$ elements. If the prime
factorization of $f - 1$ is

    $f - 1 = p_1^{m_1} p_2^{m_2} \cdots p_k^{m_k}$,

then, by the lemma, there exist elements $a_1, a_2, \ldots, a_k \in F$ such that $o(a_i) = p_i^{m_i}$ for all
$i = 1, 2, \ldots, k$. It follows from Proposition 7.7 that

    $o(a_1 a_2 \cdots a_k) = p_1^{m_1} p_2^{m_2} \cdots p_k^{m_k} = f - 1$.

Hence $a_1 a_2 \cdots a_k$ is the required primitive element of F. ∎
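The proofs of Lemma 7.16 and Theorem 7.17 are constructive, and the construction can be followed step by step in a small field. The following is a hedged sketch: the tuple representation, the trial-division factorization, and the names `gf_mul`, `gf_pow`, `prime_power_factors`, and `primitive_element` are all assumptions made for this example.

```python
from itertools import product


def gf_mul(a, b, modulus, p):
    """Product of two coefficient tuples modulo a monic polynomial over Z_p."""
    n = len(modulus) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for k in range(2 * n - 2, n - 1, -1):  # reduce x^k for k >= n
        c, prod[k] = prod[k], 0
        for j in range(n):
            prod[k - n + j] = (prod[k - n + j] - c * modulus[j]) % p
    return tuple(prod[:n])


def gf_pow(a, e, modulus, p):
    """a**e by repeated squaring (e >= 0)."""
    n = len(modulus) - 1
    result, base = (1,) + (0,) * (n - 1), a
    while e > 0:
        if e & 1:
            result = gf_mul(result, base, modulus, p)
        base = gf_mul(base, base, modulus, p)
        e >>= 1
    return result


def prime_power_factors(m):
    """Map each prime q dividing m to the largest power of q dividing m."""
    factors, q = {}, 2
    while q * q <= m:
        while m % q == 0:
            factors[q] = factors.get(q, 1) * q
            m //= q
        q += 1
    if m > 1:
        factors[m] = factors.get(m, 1) * m
    return factors


def primitive_element(modulus, p):
    """Build a primitive element as in the proof of Theorem 7.17."""
    n = len(modulus) - 1
    one = (1,) + (0,) * (n - 1)
    f = p ** n
    candidates = [b for b in product(range(p), repeat=n) if any(b)]
    result = one
    for q, q_to_m in prime_power_factors(f - 1).items():
        # Lemma 7.16: choose b with b^((f-1)/q) != 1; then a has order q^m.
        b = next(c for c in candidates
                 if gf_pow(c, (f - 1) // q, modulus, p) != one)
        a = gf_pow(b, (f - 1) // q_to_m, modulus, p)
        result = gf_mul(result, a, modulus, p)
    return result


print(primitive_element([1, 1, 1, 1, 1], 2))  # GF(2, x^4 + x^3 + x^2 + x + 1)
print(primitive_element([16, 1], 17))         # Z_17 viewed as GF(17, x - 1)
```

The search for $b$ simply scans the nonzero elements in a fixed order, so the element returned depends on that order; any other admissible choice of $b$ would serve equally well.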


It was pointed out above that $\mathbb{Z}_p$ = GF(p, x - 1) so that $\mathbb{Z}_p$ is also a Galois field and
hence Theorem 7.17 guarantees the existence of primitive elements in $\mathbb{Z}_p$. Thus, 3 is
such a primitive element of $\mathbb{Z}_{17}$, since its first 16 powers are

    1, 3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6,

i.e., all the nonzero elements of $\mathbb{Z}_{17}$. This sequence is, of course, identical with the


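exponents in the sum

    $\zeta + \zeta^3 + \zeta^9 + \zeta^{10} + \zeta^{13} + \zeta^5 + \zeta^{15} + \zeta^{11} + \zeta^{16} + \zeta^{14} + \zeta^8 + \zeta^7 + \zeta^4 + \zeta^{12} + \zeta^2 + \zeta^6$

that was used in Section 2.4 to prove the constructibility of the regular 17-sided polygon.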
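The list of powers of 3 modulo 17 is easy to reproduce. The snippet below is a small illustrative sketch; the function names `powers` and `is_primitive_root` are our own and it relies only on Python's built-in three-argument `pow`.

```python
def powers(g, p):
    """The first p - 1 powers g^0, g^1, ..., g^(p-2) reduced modulo p."""
    return [pow(g, k, p) for k in range(p - 1)]


def is_primitive_root(g, p):
    """g is primitive modulo the prime p when its powers cover 1, ..., p - 1."""
    return sorted(powers(g, p)) == list(range(1, p))


print(powers(3, 17))
print(is_primitive_root(3, 17), is_primitive_root(2, 17))
```

By contrast, 2 is not primitive modulo 17, since $2^8 = 256 \equiv 1 \pmod{17}$.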

The primitive elements of $\mathbb{Z}_p$, whose existence is guaranteed by Theorem 7.17, are
identical with the primitive roots (modulo p) that were defined in Section 5.2. It was
Euler who first proved the existence of these primitive roots modulo p. Gauss expanded
on this work of Euler’s and also applied it to his analysis of the cyclotomic equation
$x^p - 1 = 0$. Galois, in attempting to generalize Gauss’s method to prove the resolvability
of other equations, invented what came to be known as the Galois fields and observed that 
the same methods that were used to prove the existence of primitive roots modulo p could 
also be used to establish the existence of primitive elements in his fields (Theorem 7.17). 

The identification of primitive elements is an issue that puzzled Euler. Both Gauss
and Galois later introduced some methodology into this question. Since this would
take us outside the scope of this text we merely point out that trial and error can always
be used to locate primitive elements in relatively small Galois fields. Thus, the Galois
imaginaries $\alpha$ and $\beta$ of Section 7.1 are clearly primitive elements of GF(2, $x^2 + x + 1$)
and GF(2, $x^3 + x^2 + 1$), respectively, whereas $\gamma$ is not a primitive element of
GF(2, $x^4 + x^3 + x^2 + x + 1$) since $\gamma^5 = 1$. However, the element $1 + \gamma$ is a primitive
element of this latter field. To see this, note that by Proposition 7.7 and Theorem 7.11,
$o(1 + \gamma)$ is a divisor of 15. However, by the Binomial Theorem (Theorem 6.2)

    $(1 + \gamma)^3 = 1 + 3\gamma + 3\gamma^2 + \gamma^3 = 1 + \gamma + \gamma^2 + \gamma^3 \neq 1$

and

    $(1 + \gamma)^5 = 1 + 5\gamma + 10\gamma^2 + 10\gamma^3 + 5\gamma^4 + \gamma^5 = 1 + \gamma + \gamma^4 + \gamma^5$
        $= 1 + \gamma + (1 + \gamma + \gamma^2 + \gamma^3) + 1 = 1 + \gamma^2 + \gamma^3 \neq 1$.

Hence, $o(1 + \gamma) = 15$ and so $1 + \gamma$ is indeed a primitive element of
GF(2, $x^4 + x^3 + x^2 + x + 1$). We also note in passing that once it is known that a certain
element $\zeta$ is a primitive element of some Galois field, then it follows from Proposition 7.7
that the other primitive elements of this field are the powers $\zeta^m$ where $m$ is relatively
prime to $p^\nu - 1$.
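The closing remark can be tested against a brute-force search. The sketch below, written under the assumption that elements are coefficient tuples (lowest degree first) and with illustrative helper names, lists the powers $\sigma^m$ with $m$ relatively prime to 8 in GF(3, $x^2 + x + 2$) and compares them with the set of all elements of order 8.

```python
from itertools import product
from math import gcd


def gf_mul(a, b, modulus, p):
    """Product of two coefficient tuples modulo a monic polynomial over Z_p."""
    n = len(modulus) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for k in range(2 * n - 2, n - 1, -1):  # reduce x^k for k >= n
        c, prod[k] = prod[k], 0
        for j in range(n):
            prod[k - n + j] = (prod[k - n + j] - c * modulus[j]) % p
    return tuple(prod[:n])


def order(a, modulus, p):
    """Least k >= 1 with a^k = 1, for a nonzero element a."""
    n = len(modulus) - 1
    one = (1,) + (0,) * (n - 1)
    k, current = 1, a
    while current != one:
        current = gf_mul(current, a, modulus, p)
        k += 1
    return k


modulus, p = [2, 1, 1], 3        # x^2 + x + 2 over Z_3
size = p ** 2 - 1                # number of nonzero elements
sigma = (0, 1)

# Powers sigma^m with gcd(m, size) = 1, built by repeated multiplication.
from_powers, current = set(), (1, 0)
for m in range(1, size):
    current = gf_mul(current, sigma, modulus, p)
    if gcd(m, size) == 1:
        from_powers.add(current)

# Independent check: all elements whose order equals size.
by_search = {a for a in product(range(p), repeat=2)
             if any(a) and order(a, modulus, p) == size}

print(sorted(from_powers))
print(from_powers == by_search)
```

Here the four primitive elements are $\sigma$, $\sigma^3 = 2 + 2\sigma$, $\sigma^5 = 2\sigma$, and $\sigma^7 = 1 + \sigma$.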


Exercises 7.3 


List all the primitive elements of the fields in Exercises 7.3.1 to 7.3.9. 


I. 
2. 
3. 
4. 
5. 


10. 


II. 


12. 


13. 


14. 


GF(2, x? +x +1) 6. GF(2,x4+x3 +x7+x+41) 
iG 8 
Ghee) 7. GF(3,x2+x+2) 
Zs, 
8. GF(5,x? +4x +2) 
Zi; 
GF(2, x4 +x + 1) 9. GF(7,x? +6x + 3) 


For any element $a \in$ GF(p, P(x)), let $r(a)$ denote the number of distinct elements
$b$ in GF(p, P(x)) such that $b^{p-1} = a$. Prove that if $a \neq 0$, then $r(a) = 0$ or $p - 1$.


Let GF(p, P(x)) be any Galois field, and let $r$ be a positive integer such that $r \not\equiv 0$
(mod $p^\nu - 1$), where $\nu$ is the degree of P(x). Prove that the sum of the $r$-th powers
of the elements of GF(p, P(x)) is zero.


Prove that the product of all the primitive elements of GF(p, P(x)) is 1, unless
GF(p, P(x)) is $\mathbb{Z}_3$, in which case this product is 2.


Prove that if 2 is a primitive element of GF(p, P(x)) where P(x) has degree v, then 
eee 
i=l 


Prove that for every prime $p$, the polynomial $x^p - x - 1$ is irreducible over $\mathbb{Z}_p$.




7.4 On the Variety of Galois Fields 


We now know that for every polynomial P(x) that is irreducible over $\mathbb{Z}_p$ there is a
corresponding Galois field GF(p, P(x)). This observation made it possible to construct a
variety of new fields each of which contains $p^\nu$ elements, where $p$ is some prime and $\nu$ is
some positive integer. It is appropriate at this point to address the issue of classifying these
new structures. There are two questions that every experienced mathematician would
ask in this context. Given any such $p$ and $\nu$, does there exist a Galois field of order $p^\nu$?
Given any two such Galois fields, when are they in fact one and the same? Unfortunately,
the complete resolutions of these questions lie beyond the bounds of this book and the
following discussion provides only informal answers.

One way of proving that a field of order $p^\nu$ exists is to display a polynomial of degree
$\nu$ that is irreducible over $\mathbb{Z}_p$. This task is not easy. In fact, it turns out to be easier to
count the number of all such polynomials than to produce even one. Exercises 6.2.6,
6.2.7, and 7.3.14 deal with some special cases of this issue.

Next, we turn to the second question and reexamine the first Galois field constructed in
this chapter, GF(2, $x^2 + x + 1$). This field was constructed by stipulating that $\alpha$ is a zero of
the polynomial $x^2 + x + 1$ over $\mathbb{Z}_2$ and then extracting some arithmetical consequences.
However, our experience with all the previously constructed fields, namely, the rationals,
reals, complex, and modulo p arithmetic, leads us to expect any quadratic to have two
zeroes. Should we therefore stipulate the existence of another zero $\alpha'$ of $x^2 + x + 1$ over
$\mathbb{Z}_2$ and then proceed to create its Galois field? This is, of course, unnecessary, since
$\alpha'$ gives rise to a field that behaves exactly like that associated with $\alpha$, except that each
occurrence of $\alpha$ in the latter will be replaced by an $\alpha'$. It turns out that the redundancy
goes even deeper. The element $\alpha'$ is already in GF(2, $x^2 + x + 1$). In fact $\alpha' = \alpha^2$, for

    $(\alpha^2)^2 + \alpha^2 + 1 = \alpha^4 + \alpha^2 + 1 = \alpha + \alpha^2 + 1 = 0$.

A similar phenomenon occurs in GF(2, $x^3 + x^2 + 1$) whose Galois imaginary was
denoted by $\beta$. Note that

    $(\beta^2)^3 + (\beta^2)^2 + 1 = \beta^6 + \beta^4 + 1 = (\beta + \beta^2) + (1 + \beta + \beta^2) + 1 = 0$ over $\mathbb{Z}_2$,

implying that $\beta^2$ is another zero of $x^3 + x^2 + 1$ in this field. Cubics, however, can have
up to three zeroes, so we can expect yet another zero of this polynomial. The element $\beta^3$
fails to be such a zero since

    $(\beta^3)^3 + (\beta^3)^2 + 1 = \beta^9 + \beta^6 + 1 = \beta^2 + (\beta + \beta^2) + 1 = 1 + \beta \neq 0$ over $\mathbb{Z}_2$.

However, $\beta^4$ turns out to be the third zero of $x^3 + x^2 + 1$ since

    $(\beta^4)^3 + (\beta^4)^2 + 1 = \beta^{12} + \beta^8 + 1 = \beta^5 + \beta + 1 = (1 + \beta) + \beta + 1 = 0$ over $\mathbb{Z}_2$.

With hindsight, we could have argued as follows. Assuming that $\beta$ was a zero of
$x^3 + x^2 + 1$ over $\mathbb{Z}_2$ it was shown that $\beta^2$ was also such a zero. Hence, beginning with the fact
that $\beta^2$ is a zero of this polynomial, it follows that its square, namely $(\beta^2)^2 = \beta^4$, should
also be such a zero, as is indeed the case. Note that the square of $\beta^4$ is $\beta^8 = \beta$ which is
also a zero of $x^3 + x^2 + 1$, but not a new one.

Let us examine another example before announcing the general principle. Consider
the polynomial $x^2 + x + 2$ which is irreducible over $\mathbb{Z}_3$, and whose cyclic table was
constructed in Section 7.1 in terms of the Galois imaginary $\sigma$. Since

    $(\sigma^2)^2 + \sigma^2 + 2 = \sigma^4 + \sigma^2 + 2 = 2 + (1 + 2\sigma) + 2 = 2 + 2\sigma \neq 0$,

it follows that $\sigma^2$ is not another zero of $x^2 + x + 2$ over $\mathbb{Z}_3$. However,

    $(\sigma^3)^2 + \sigma^3 + 2 = \sigma^6 + \sigma^3 + 2 = (2 + \sigma) + (2 + 2\sigma) + 2 = 0$,

and hence $\sigma^3$ is the other zero of $x^2 + x + 2$. So, of course, is $(\sigma^3)^3 = \sigma^9 = \sigma$. The pattern
is clear and is formulated in Proposition 7.19 below.
Lemma 7.18 If $a, b, c, \ldots \in$ GF(p, P(x)), then $(a + b + c + \cdots)^p = a^p + b^p + c^p + \cdots$.

Proof. Inasmuch as the Binomial Theorem (Theorem 6.2) holds in arbitrary fields, and
since, as was argued in the proof of Proposition 5.14, $\binom{p}{k} \equiv 0 \pmod p$ for $0 < k < p$, it
is seen that $(a + b)^p = a^p + b^p$ for any $a, b \in$ GF(p, P(x)). The lemma now follows by
an easy induction argument. ∎

Proposition 7.19 If $\alpha$ is a zero of the polynomial P(x) over $\mathbb{Z}_p$, then

    $\alpha^p, \alpha^{p^2}, \alpha^{p^3}, \ldots$

are also zeroes of P(x).

Proof. It clearly suffices to show that if $\alpha$ is a zero of P(x), then so is $\alpha^p$. Suppose

    $P(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$, with $a_0, \ldots, a_n \in \mathbb{Z}_p$.

If $\alpha$ is any zero of P(x), then it follows from Lemma 7.18 and Fermat’s Theorem
(Theorem 5.15) that

    $P(\alpha^p) = a_0(\alpha^p)^n + a_1(\alpha^p)^{n-1} + \cdots + a_{n-1}\alpha^p + a_n$
        $= a_0^p(\alpha^n)^p + a_1^p(\alpha^{n-1})^p + \cdots + a_{n-1}^p\alpha^p + a_n^p$
        $= (a_0\alpha^n + a_1\alpha^{n-1} + \cdots + a_{n-1}\alpha + a_n)^p = 0^p = 0$.

Thus, if $\alpha$ is any zero of P(x) so is $\alpha^p$. ∎
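Proposition 7.19 can be observed concretely by repeatedly raising a zero to the $p$-th power and substituting the results back into the polynomial. The following sketch is illustrative only: the tuple representation and the names `gf_mul`, `gf_pow`, `evaluate`, and `frobenius_orbit` are assumptions of this example rather than notation from the text.

```python
def gf_mul(a, b, modulus, p):
    """Product of two coefficient tuples modulo a monic polynomial over Z_p."""
    n = len(modulus) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for k in range(2 * n - 2, n - 1, -1):  # reduce x^k for k >= n
        c, prod[k] = prod[k], 0
        for j in range(n):
            prod[k - n + j] = (prod[k - n + j] - c * modulus[j]) % p
    return tuple(prod[:n])


def gf_pow(a, e, modulus, p):
    """a**e by repeated squaring (e >= 0)."""
    n = len(modulus) - 1
    result, base = (1,) + (0,) * (n - 1), a
    while e > 0:
        if e & 1:
            result = gf_mul(result, base, modulus, p)
        base = gf_mul(base, base, modulus, p)
        e >>= 1
    return result


def evaluate(coeffs, a, modulus, p):
    """Value at a of the polynomial sum coeffs[k] * x^k (coeffs in Z_p), by Horner's rule."""
    n = len(modulus) - 1
    result = (0,) * n
    for c in reversed(coeffs):
        result = gf_mul(result, a, modulus, p)
        result = ((result[0] + c) % p,) + result[1:]
    return result


def frobenius_orbit(a, modulus, p):
    """The distinct elements a, a^p, a^(p^2), ... in the order they appear."""
    orbit, current = [], a
    while current not in orbit:
        orbit.append(current)
        current = gf_pow(current, p, modulus, p)
    return orbit


cubic, beta = [1, 0, 1, 1], (0, 1, 0)          # x^3 + x^2 + 1 over Z_2
for z in frobenius_orbit(beta, cubic, 2):
    print(z, evaluate(cubic, z, cubic, 2))     # each value is the zero element

quadratic, sigma = [2, 1, 1], (0, 1)           # x^2 + x + 2 over Z_3
print(frobenius_orbit(sigma, quadratic, 3))
```

For $\beta$ the orbit consists of $\beta$, $\beta^2$, and $\beta^4 = 1 + \beta + \beta^2$, and for $\sigma$ it consists of $\sigma$ and $\sigma^3 = 2 + 2\sigma$, in agreement with the examples computed by hand.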


When P(x) is not irreducible, the list given in the statement of the above proposition
need not contain all of its zeroes (Exercise 7.4.15). When P(x) is irreducible, a
stronger claim can be made. We state this without justification, and relegate the proof to
Exercises 7.4.7 to 7.4.9.

Proposition 7.20 If $\alpha$ is a zero of the irreducible polynomial P(x) of degree $\nu$ over $\mathbb{Z}_p$,
then

    $\alpha, \alpha^p, \alpha^{p^2}, \ldots, \alpha^{p^{\nu-1}}$

are all of its zeroes.


We have argued that Galois fields that arise from different imaginaries that correspond
to the same irreducible polynomial are in fact one and the same. Surprisingly, a similar
phenomenon occurs when the Galois imaginaries correspond to different irreducible
polynomials that have the same degree. Such is the case for the fields GF(2, $x^3 + x^2 + 1$)
and GF(2, $x^3 + x + 1$) both of which have $2^3 = 8$ elements. The cyclic table of the
first of these appears in Section 7.1, and that of the second is displayed in Table 7.2
(Exercise 7.1.1). These two tables are quite different. Nevertheless, these two fields are
really one and the same. To see this observe that $\beta^3$ of GF(2, $x^3 + x^2 + 1$) is also a zero
of $x^3 + x + 1$, the same polynomial that gave rise to the Galois imaginary $\tau$. For,

    $(\beta^3)^3 + \beta^3 + 1 = \beta^9 + \beta^3 + 1 = \beta^2 + \beta^3 + 1 = 0$.

Thus, having constructed the field GF(2, $x^3 + x^2 + 1$), there is no call for associating a
new Galois imaginary $\tau$ to the polynomial $x^3 + x + 1$ which is irreducible over $\mathbb{Z}_2$. The
element $\beta^3$ of GF(2, $x^3 + x^2 + 1$) is already a zero of this polynomial, as are $(\beta^3)^2 = \beta^6$
and $(\beta^6)^2 = \beta^{12} = \beta^5$. In other words, there is an unanticipated redundancy in the process
Galois used to create his fields. This is what Galois had in mind when he wrote in the
above quoted passage

We are concerned here with the classification of these imaginaries and their minimization.

    $\tau^0 = 1$                $\tau^4 = \tau + \tau^2$
    $\tau^1 = \tau$             $\tau^5 = 1 + \tau + \tau^2$
    $\tau^2 = \tau^2$           $\tau^6 = 1 + \tau^2$
    $\tau^3 = 1 + \tau$         $\tau^7 = 1$

Table 7.2 The cyclic table of GF(2, $x^3 + x + 1$)


It is very tempting at this point to argue as follows. If P(x) and Q(x) are irreducible
polynomials of degree $\nu$ over $\mathbb{Z}_p$ then, by Corollary 7.15 both GF(p, P(x)) and GF(p, Q(x))
consist of all the zeroes of $x^{p^\nu} - x$, and consequently, these two fields should be one and
the same. The flaw in this argument is exposed by the application of an analogous
argument to the polynomial $x^2 + 1$. This polynomial has zeroes {2, 3} in $\mathbb{Z}_5$ and {4, 13}
in $\mathbb{Z}_{17}$, yet we cannot conclude that {2, 3} = {4, 13} in any sense. Similarly, the zeroes
that $x^{p^\nu} - x$ has in GF(p, P(x)) have, in principle, nothing to do with its zeroes in
GF(p, Q(x)). The faultiness of this argument notwithstanding, its conclusion happens
to be valid. We state it informally below since the careful formulation and the proof of
this theorem lie beyond the scope of this text.

Theorem 7.21 For any prime number $p$ and any positive integer $\nu$ there is exactly one
Galois field containing $p^\nu$ elements.

The field whose existence is guaranteed by this theorem is denoted by GF($p^\nu$).
Accordingly,

    GF(2, $x^3 + x^2 + 1$) = GF(2, $x^3 + x + 1$) = GF($2^3$)

and GF(3, $x^2 + x + 2$) = GF($3^2$).

We conclude this section with another aspect of primitivity. A polynomial P(x) of
degree $\nu$ is said to be primitive over $\mathbb{Z}_p$ if it is irreducible over $\mathbb{Z}_p$ and its associated Galois
imaginary $\zeta$ has order $p^\nu - 1$. In other words, P(x) is primitive over $\mathbb{Z}_p$ when it is
irreducible over $\mathbb{Z}_p$ and its Galois imaginary $\zeta$ is also primitive. Since the construction
of the cyclic table of GF(p, P(x)) depends only on P(x), rather than the specific $\zeta$, it
follows that either all the zeroes of P(x) are primitive or none of them is primitive. Thus
the polynomial $x^3 + x^2 + 1$ is primitive over $\mathbb{Z}_2$, whereas $x^4 + x^3 + x^2 + x + 1$ is not.
Finding a primitive polynomial is not an easy task. However, once such a polynomial
has been found, it is easy to find all the other primitive polynomials of the same degree.
Consider the polynomial $x^2 + x + 2$ which is known to be primitive over $\mathbb{Z}_3$. If $\sigma$ is its
associated Galois imaginary, then, by Proposition 7.7, the elements $\sigma^3$, $\sigma^5$, and $\sigma^7$ are
all the other primitive elements of GF(3, $x^2 + x + 2$). By Proposition 7.19, $\sigma$ and $\sigma^3$ are
the zeroes of the same irreducible polynomial, as are $\sigma^5$ and $(\sigma^5)^3 = \sigma^{15} = \sigma^7$. Thus the
monic primitive quadratic polynomials over $\mathbb{Z}_3$ are

    $(x - \sigma)(x - \sigma^3) = x^2 - (\sigma + \sigma^3)x + \sigma\sigma^3 = x^2 - (\sigma + 2\sigma + 2)x + \sigma^4 = x^2 + x + 2$

and

    $(x - \sigma^5)(x - \sigma^7) = x^2 - (\sigma^5 + \sigma^7)x + \sigma^5\sigma^7 = x^2 - (2\sigma + \sigma + 1)x + \sigma^{12} = x^2 + 2x + 2$.
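The pairing of primitive elements into quadratics can be reproduced mechanically. The sketch below is an illustration built on assumptions of our own (coefficient tuples, lowest degree first, and the hypothetical helpers `gf_mul`, `gf_pow`, and `quadratic_from_roots`): each primitive power $\sigma^m$ is paired with its cube, and the product $(x - r)(x - r^3)$ is expanded.

```python
from math import gcd


def gf_mul(a, b, modulus, p):
    """Product of two coefficient tuples modulo a monic polynomial over Z_p."""
    n = len(modulus) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for k in range(2 * n - 2, n - 1, -1):  # reduce x^k for k >= n
        c, prod[k] = prod[k], 0
        for j in range(n):
            prod[k - n + j] = (prod[k - n + j] - c * modulus[j]) % p
    return tuple(prod[:n])


def gf_pow(a, e, modulus, p):
    """a**e by repeated squaring (e >= 0)."""
    n = len(modulus) - 1
    result, base = (1,) + (0,) * (n - 1), a
    while e > 0:
        if e & 1:
            result = gf_mul(result, base, modulus, p)
        base = gf_mul(base, base, modulus, p)
        e >>= 1
    return result


def quadratic_from_roots(r, s, modulus, p):
    """Coefficients (c0, c1, 1) of (x - r)(x - s) = x^2 - (r + s)x + rs.

    Checks that r + s and rs really lie in Z_p, as they do for conjugate roots.
    """
    total = tuple((x + y) % p for x, y in zip(r, s))
    prod = gf_mul(r, s, modulus, p)
    assert not any(total[1:]) and not any(prod[1:]), "coefficients are not in Z_p"
    return (prod[0], (-total[0]) % p, 1)


modulus, p = [2, 1, 1], 3       # x^2 + x + 2 over Z_3
sigma = (0, 1)
primitive_quadratics = set()
for m in range(1, 8):
    if gcd(m, 8) == 1:          # sigma^m is primitive (Proposition 7.7)
        r = gf_pow(sigma, m, modulus, p)
        s = gf_pow(r, p, modulus, p)   # the other zero, by Proposition 7.19
        primitive_quadratics.add(quadratic_from_roots(r, s, modulus, p))

print(sorted(primitive_quadratics))
```

Each tuple lists the constant term first, so `(2, 1, 1)` stands for $x^2 + x + 2$ and `(2, 2, 1)` for $x^2 + 2x + 2$.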


Exercises 7.4 


1. Use the fact that $x^2 + 4x + 2$ is a primitive polynomial to determine all the monic
primitive quadratic polynomials over $\mathbb{Z}_5$.

2. Verify that the polynomial $x^2 - x + 3$ is primitive over $\mathbb{Z}_7$. Write out its cyclic
table and use it to find all the other seven monic quadratic polynomials that are
primitive over Z.. 

3- Determine all the monic primitive cubic polynomials over Z,. 


4. Prove that if $2^\nu - 1$ is a prime number, then every polynomial of degree $\nu$ that is
irreducible over $\mathbb{Z}_2$ is also primitive.

5. Prove that if $p$ is a prime other than 2 and P(x) is an irreducible polynomial over
$\mathbb{Z}_p$ of degree greater than 1, then GF(p, P(x)) has some nonprimitive elements
besides 0 and 1.

6. Suppose the polynomial $a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$ is primitive over a field
F and $a_0, a_n \neq 0$. Prove that the polynomial $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ is
also primitive over F.

7. Prove that for every element $a$ of GF(p, P(x)) there is a positive integer $k$ such
that

    $a, a^p, a^{p^2}, \ldots, a^{p^{k-1}}$

are all distinct, and $a^{p^k} = a$.

8. For any $\alpha \in$ GF(p, P(x)), let

    $M_\alpha(x) = (x - \alpha)(x - \alpha^p)(x - \alpha^{p^2}) \cdots (x - \alpha^{p^{k-1}})$

where $k$ is as defined in Exercise 7.4.7. Prove that $M_\alpha(x) \in \mathbb{Z}_p[x]$.

9. Prove Proposition 7.20.

By Proposition 7.20, any polynomial that has $\alpha$ as its zero is divisible by the polynomial
$M_\alpha(x)$ of Exercise 7.4.8. The polynomial $M_\alpha(x)$ is therefore called the (monic) minimal
polynomial of $\alpha$. Find the monic minimal polynomials for all the elements of each of the
fields in Exercises 7.4.10 to 7.4.14.


10. GF(2, x? +x +1) 12. GF(3, x* +2x +2) 14. GF(7,x?—x +3) 
11. GF(2,x4+x +1) 13. GF(5, x? +4x +2) 


15. Show that the conclusion of Proposition 7.20 need not be valid if P(x) is reducible
over $\mathbb{Z}_p$.


Chapter Summary 


Working by analogy with the complex numbers, Galois created a host of new fields, now 
bearing his name. The nonzero elements of these Galois fields are also roots of unity 
and as such have orders whose properties are indistinguishable from the orders of the 
complex and modular roots of unity. We proved the Primitive Element Theorem for 
these (and the modular) roots of unity. It was also pointed out that these fields are subject 
to some subtle relationships, since seemingly different fields may turn out, upon careful
examination, to be identical.




Chapter Review Exercises 


Mark the following true or false. 


I. 


Let P(x) be an irreducible quadratic over Z,, and let a be the associated Galois 


imaginary. Then there exist 2, 6 € Z, such that (3+ 4a)(a+ ba) =1. 


2. Let P(x) be an irreducible quadratic over Z,, and let « be the associated Galois 
imaginary. Then there exist 2, 6 ¢ Z, such that (2+ 4a)(a+ ba)=1+5a. 
3. Let P(x) be an irreducible cubic over $\mathbb{Z}_3$. Then GF(3, P(x)) has 30 elements.
4. Let P(x) be an irreducible polynomial over $\mathbb{Z}_p$. Then the equation P(x) = 0 has a
solution in GF(p, P(x)).
5- Let P(x) be an irreducible polynomial of degree 4 over Z,. Then every element of 
GF(4, P(x)) is a zero of the polynomial «°° + x75 + x7 +x. 
6. The polynomial x? + 1 is primitive over Z,. 
7. There is an element $r$ in $\mathbb{Z}_{31}$ such that the sequence $1, r, r^2, \ldots$ (mod 31) contains
all the nonzero elements of $\mathbb{Z}_{31}$.
8. Let ae GF(2,x°+x* +1). If P(x) eZ,[x] and P(x) =0, then P(a”) =0. 

New Terms

cyclic table, 135              Galois polynomial, 142
derivative, 154                minimal polynomial, 152
Galois field, 135              order, 136
Galois imaginaries, 132        primitive, 135, 150




Supplementary Exercises 


1. Write a computer script that lists the primitive monic polynomials of degree d.

2. Prove that for every positive integer $d$ and for every prime $p$, there is a primitive
polynomial of degree $d$ over $\mathbb{Z}_p$.

3. Find a formula for the number of monic irreducible polynomials of degree $d$ over
$\mathbb{Z}_p$.

4. Write a computer script that will solve any polynomial equation Q(x) = 0 over
any Galois field GF(p, P(x)).

5. If F is any field and

    $P(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n \in F[x]$,

then the derivative of P(x) is defined as the polynomial

    $P'(x) = n a_0 x^{n-1} + (n - 1) a_1 x^{n-2} + \cdots + a_{n-1} \in F[x]$.

Prove that this derivative has the following properties:

(a) $[P(x) + Q(x)]' = P'(x) + Q'(x)$ for any $P(x), Q(x) \in F[x]$.

(b) $[cP(x)]' = cP'(x)$ for any $c \in F$ and $P(x) \in F[x]$.

(c) $[P(x)Q(x)]' = P'(x)Q(x) + P(x)Q'(x)$ for any $P(x), Q(x) \in F[x]$.

(d) P(x) has repeated zeroes in F only if the greatest common divisor of P(x) and
$P'(x)$ has degree at least 1.

6. Let $p$ be any prime and $n$ any positive integer. Prove that the polynomial $x^n - 1$
has repeated zeroes in $\mathbb{Z}_p$ if and only if $p$ is a factor of $n$.

7. For how many primes $p$ is 10 a primitive root modulo $p$?


Chapter 8 


PERMUTATIONS 


Motivated by Lagrange's solution of the quartic equation we consider the general
question of what happens to a multivariable function when its variables are
permuted. Some surprising results are derived and these lead us to a deeper examination
of the notion of a permutation.


8.1 Permuting the Variables of a Function I 


The key to Lagrange's solution of the general quartic equation (Section 6.5) was the fact
that when the variables of the polynomial $x_1x_2 + x_3x_4$ are permuted in all the possible
ways so as to produce the 24 polynomials

$$\begin{array}{llll}
x_1x_2 + x_3x_4, & x_1x_2 + x_4x_3, & x_1x_3 + x_2x_4, & x_1x_3 + x_4x_2, \\
x_1x_4 + x_2x_3, & x_1x_4 + x_3x_2, & x_2x_1 + x_3x_4, & x_2x_1 + x_4x_3, \\
x_2x_3 + x_1x_4, & x_2x_3 + x_4x_1, & x_2x_4 + x_1x_3, & x_2x_4 + x_3x_1, \\
x_3x_1 + x_2x_4, & x_3x_1 + x_4x_2, & x_3x_2 + x_1x_4, & x_3x_2 + x_4x_1, \\
x_3x_4 + x_1x_2, & x_3x_4 + x_2x_1, & x_4x_1 + x_2x_3, & x_4x_1 + x_3x_2, \\
x_4x_2 + x_1x_3, & x_4x_2 + x_3x_1, & x_4x_3 + x_1x_2, & x_4x_3 + x_2x_1,
\end{array}$$

it follows from the commutativity of addition and multiplication that in fact only three
distinct polynomials emerge, namely, $x_1x_2 + x_3x_4$, $x_1x_3 + x_2x_4$, and $x_1x_4 + x_2x_3$. In
general, when two polynomials $P(x, y, z, \ldots)$ and $Q(x, y, z, \ldots)$ can be obtained from
each other by permuting their variables, these polynomials are said to be variants of each
other. Thus, $x_1x_2 + x_3x_4$ has the 24 variants listed above, whereas $x_1x_2 + x_3$ has the
six variants $x_1x_2 + x_3$, $x_2x_1 + x_3$, $x_1x_3 + x_2$, $x_3x_1 + x_2$, $x_2x_3 + x_1$, and $x_3x_2 + x_1$. Two


Introductory Modern Algebra, Second Edition.
By Saul Stahl Copyright © 2013 John Wiley & Sons, Inc.




variants are said to be distinct variants if they differ as functions over $\mathbb{C}$. Intuition is a
fairly reliable guide in this context. When necessary, however, appropriate substitutions
can be used to verify distinctness. Thus, the substitution of $1, 1, 0, 0$ for $x_1, x_2, x_3, x_4$,
respectively, proves that the variants $x_1x_2 + x_3x_4$ and $x_1x_3 + x_2x_4$ are distinct, since they
assume the respective values 1 and 0. Lagrange's observation can now be phrased as follows: the
polynomial $x_1x_2 + x_3x_4$ has three distinct variants. Similarly, the polynomial $x_1x_2 + x_3$
also has three distinct variants. A function that has no distinct variants other than itself is said to be an
invariant function. The polynomials $x_1x_2$ and $x_1 + x_2 + x_3$ as well as all the elementary
symmetric polynomials of Section 6.4 are examples of invariant functions.
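
The counting of distinct variants described above can be checked mechanically for small examples. The following is only a sketch under stated assumptions: the helper name `count_distinct_variants` and the sample points (built from the first fifteen primes, so at most five variables are supported) are choices made here for illustration. Variants are compared by their values at the sample points, so in principle two different variants could collide; for the examples shown, a single sample point already separates them.

```python
from itertools import permutations

# Three fixed sample points built from the first 15 primes (supports n <= 5).
# These particular points are an arbitrary choice made for this sketch.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

def count_distinct_variants(f, n):
    """Count the variants of f(x_1, ..., x_n) that differ at the sample points."""
    points = [PRIMES[5 * j: 5 * j + n] for j in range(3)]
    signatures = set()
    for perm in permutations(range(n)):
        # Values of the variant obtained by feeding f its variables in permuted order.
        signature = tuple(f(*(pt[i] for i in perm)) for pt in points)
        signatures.add(signature)
    return len(signatures)

print(count_distinct_variants(lambda a, b, c, d: a * b + c * d, 4))  # 3
print(count_distinct_variants(lambda a, b, c: a * b + c, 3))         # 3
print(count_distinct_variants(lambda a, b, c: a + b + c, 3))         # 1
```

The first two calls reproduce the counts stated in the text for $x_1x_2 + x_3x_4$ and $x_1x_2 + x_3$; the third confirms that $x_1 + x_2 + x_3$ is invariant.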

Since Lagrange's solution of the quartic equation hinges on the existence of a polynomial
of four variables with three distinct variants, it is reasonable, when attempting the
solution of the fifth-degree equation, to look for polynomials in five variables that have
four (or perhaps three) distinct variants. Surprisingly, such polynomials do not exist.

Let us digress here and change the point of view somewhat. The five-variable polynomial
$x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2$ is clearly invariant. On the other hand, the polynomial
$x_1 + x_2 + x_3 + x_4 - x_5$ has five distinct variants, namely, itself and the four polynomials

$$\begin{array}{ll}
x_1 + x_2 + x_3 + x_5 - x_4, & x_1 + x_2 + x_4 + x_5 - x_3, \\
x_1 + x_3 + x_4 + x_5 - x_2, & x_2 + x_3 + x_4 + x_5 - x_1.
\end{array}$$

Similarly, the polynomial $x_1x_2 + x_3x_4x_5$ has $\binom{5}{2} = 10$ distinct variants, namely, itself
and the nine polynomials

$$\begin{array}{lll}
x_1x_3 + x_2x_4x_5, & x_1x_4 + x_2x_3x_5, & x_1x_5 + x_2x_3x_4, \\
x_2x_3 + x_1x_4x_5, & x_2x_4 + x_1x_3x_5, & x_2x_5 + x_1x_3x_4, \\
x_3x_4 + x_1x_2x_5, & x_3x_5 + x_1x_2x_4, & x_4x_5 + x_1x_2x_3.
\end{array}$$

However, is there a function of these five variables that has four distinct variants? A formal
proof of the nonexistence of such a function is offered below in Corollary 8.11. The
need for such a proof is underscored by the fact that the beginner's search for functions
of five variables that have two distinct variants is very likely also to meet with failure.
Such functions, however, do exist, and a method for constructing them is suggested in
Exercise 8.1.36. A complete proof of the existence of such two-variant functions is offered
in Proposition 8.14.




That there are no functions of five variables that have four (or three) distinct variants was
first recognized by Paolo Ruffini, who incorporated this observation into his unsuccessful
attempt to prove the unsolvability of the general quintic equation by radicals. Ruffini's
theorem regarding the number of distinct variants that a function of five variables can
have was generalized in 1815 by Cauchy to the statement that appears as Theorem 8.10
below. This proposition was incorporated by Abel into his groundbreaking proof of the
unsolvability of the general quintic equation by radicals. In 1847 Cauchy returned to this
topic and proved that if a function of $n > 5$ variables has fewer than $n$ distinct variants,
then that function has only one or two distinct variants. Since this stronger version
turned out to play no special role in the evolution of the theory of algebraic resolvability
of equations, it is mentioned without proof. We shall, however, prove Cauchy's 1815
theorem after providing some basic theoretical information about permutations.
Exercises 8.1 


Find the number of distinct variants that the functions in Exercises 8.1.1 to 8.1.35 have. 


I. x, +% §- x, /x,+x,/x, 9. x, x7 

2. x —% 6. sin(x—y) 10. seeks 

3. (4 — x) 7 cos(x — y) ee aed 

4 x, /x) Be a Pm 12. (x, +%),—%3)" 
13. (x, +x,)(x, + x3)(x) + x3) 20. x, ay 

14. (x1 — Xp )(%4 — ¥5)(%2 — 3) as ae ae a 

IS. xx) +4; 22. (x)%— x44) 

16, x,/x, +X; 23. (x, + x))/(%3%54) 

Df Mngocg 24- (x — xy)" + (3 — 4)? 

18. X1X4X3X4 25+ xX, Reese 

IQS. Go cease? 26. (1 ~ Xp )(%, — 3 )(%, — 5), 


27~ (x, + Xy)(xy + 3 )(x, + 4 )(H_ +5 )(Xy + X4)(%y + x4) 


28. (x, — Xy)(% — 3%, — X4)(X_ — X3)(%5 = x4 )(%3 — x4) 




2D. xy XyX3X4 Xe 30. x, XyX3x4/%5 31. (x, x,x5 + x4)/X5 
32. (x, x>x3)/(x4x5) 34. (x, — x4)(x, — 3)(%) — x3) x4 Xs 
33. (x) x) + X%3X4)X5 35. se mek Ne 


36. Use the answers of Exercise 8.1.14 and Exercise 8.1.28 to create a function of five 


variables that has two distinct variants. 


Prove that for any positive integer $n$ there exists a function of $n$ variables that has the
number of distinct variants that is specified in Exercises 8.1.37 to 8.1.43.


37-1 40. 3(3), 223 43. Aj) n>k>1 
38. 7 41. 3(7), 24 
39. (5), n>2 42. (7), n>k>0 


44. Show that for any positive integer $n$ there is a function of $n$ variables such that
every two of its variants are distinct.


8.2 Permutations 


In the previous section we had several occasions to shuffle some variables and to observe 
the effect that this transformation had on a function of these variables. We now focus 
on the shuffles themselves. The mathematical name for such a shuffle is a permutation. 
More formally, a permutation of a set $S$ is a function $\sigma$ that assigns to each element $x$ of
$S$ an element $y = \sigma(x)$ of $S$ so that

(a) if $x_1$ and $x_2$ are distinct elements of $S$, then $\sigma(x_1) \ne \sigma(x_2)$; and

(b) if $y$ is any element of $S$, then there is an element $x$ in $S$ such that $y = \sigma(x)$.

We note in passing that when the underlying set $S$ is finite these two conditions are
equivalent (Exercise 8.2.35) and so only one need be verified. The identity permutation
that transforms each element to itself is denoted by Id. In the earlier days of permutation
theory, each permutation was written as an array of two rows. The first of these rows listed
the elements of $S$ in some natural order, and the second row listed the corresponding
values of $\sigma$. Thus, the array

$$\begin{pmatrix} x_1 & x_2 & x_3 & x_4 \\ x_2 & x_3 & x_4 & x_1 \end{pmatrix}$$

was used to denote the permutation $\sigma$ such that

$$\sigma(x_i) = x_{i+1} \quad \text{(addition modulo 4)}$$

and whose effect is to convert the polynomial $x_1x_2 + x_3x_4$ to the polynomial $x_2x_3 + x_4x_1$.
Similarly, the permutation that interchanges $x_1$ with $x_3$ and also interchanges $x_2$ with
$x_4$ was denoted by

$$\begin{pmatrix} x_1 & x_2 & x_3 & x_4 \\ x_3 & x_4 & x_1 & x_2 \end{pmatrix}.$$

The letter $x$ above serves merely as a place holder and it is more efficient to eliminate it.
Thus, we shall generally restrict our attention to permutations on a set $S = \{1, 2, \ldots, n\}$,
and the above two permutations can be denoted by

$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix},$$

respectively. This notation will be further improved, but first we pause to count the
permutations of a given set.


Proposition 8.1 For every positive integer $n$ the number of permutations of the set
$\{1, 2, 3, \ldots, n\}$ is $n!$.

Proof. Let

$$\begin{pmatrix} 1 & 2 & 3 & \ldots & k & \ldots & n \\ a & b & c & \ldots & h & \ldots & j \end{pmatrix}$$

denote an arbitrary permutation of $S = \{1, 2, 3, \ldots, n\}$. Then the symbol $a$ can be
replaced by any of the $n$ elements of $S$. Once $a$ has been chosen, $b$ can be replaced by
any of the $n - 1$ elements of $S - \{a\}$. Once $b$ has been chosen, $c$ can be replaced by any
of the $n - 2$ elements of $S - \{a, b\}$. Proceeding in the same manner, it is clear that the
second row of this arbitrary permutation can be filled out in $n(n-1)(n-2) \cdots 1 = n!$
ways so as to define a bona fide permutation of $S$. ∎
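
The count in Proposition 8.1 can be confirmed by brute force for small $n$. This is a minimal sketch; the helper name `second_rows` is hypothetical and simply enumerates every admissible second row of the two-row array.

```python
from itertools import permutations
from math import factorial

def second_rows(n):
    """All possible second rows of the two-row array for S = {1, ..., n}."""
    return list(permutations(range(1, n + 1)))

for n in range(1, 7):
    # Compare the enumerated count with n! for n = 1, ..., 6.
    print(n, len(second_rows(n)), factorial(n))
```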


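Like all functions, permutations can be composed, and this composition is associative
(Exercise 8.2.29). If $\rho$ and $\sigma$ are two permutations of the set $S$, then their composition is
denoted by either $\rho \circ \sigma$ or simply by their juxtaposition $\rho\sigma$, where $\rho\sigma(a) = \rho(\sigma(a))$. Thus,
if

$$\rho = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix} \quad \text{and} \quad \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix},$$

then

$$\begin{aligned}
\rho \circ \sigma(1) &= \rho\sigma(1) = \rho(\sigma(1)) = \rho(2) = 1, \\
\rho \circ \sigma(2) &= \rho\sigma(2) = \rho(\sigma(2)) = \rho(3) = 4, \\
\rho \circ \sigma(3) &= \rho\sigma(3) = \rho(\sigma(3)) = \rho(4) = 2, \\
\rho \circ \sigma(4) &= \rho\sigma(4) = \rho(\sigma(4)) = \rho(1) = 3.
\end{aligned}$$

In fact,

$$\rho \circ \sigma = \rho\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 2 & 3 \end{pmatrix}.$$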
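
The composition just computed can be replayed in code. This is a sketch under an assumed representation: a permutation is stored as a dictionary sending each element to its image, and the values of $\rho$ and $\sigma$ are the ones read off from the four evaluations above ($\sigma(1) = 2$, $\rho(2) = 1$, and so on). The function name `compose` is a choice made here.

```python
def compose(rho, sigma):
    """Return the composition rho sigma, in which sigma acts first."""
    return {a: rho[sigma[a]] for a in sigma}

rho = {1: 3, 2: 1, 3: 4, 4: 2}
sigma = {1: 2, 2: 3, 3: 4, 4: 1}

print(compose(rho, sigma))  # {1: 1, 2: 4, 3: 2, 4: 3}
```

Reversing the arguments gives a different permutation in this example, which is a reminder that the order of the factors matters.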


This situation is quite typical and merits an explicit statement and perhaps also a formal 


proof. 


Proposition 8.2 If $\rho$ and $\sigma$ are permutations of the set $S$, then their composition
$\rho\sigma$ is also a permutation of the set $S$.

Proof. Suppose $x_1$ and $x_2$ are distinct elements of $S$. Then, since both $\rho$ and $\sigma$ are
known to be permutations, it follows from the definition first that $\sigma(x_1) \ne \sigma(x_2)$ and next
that $\rho(\sigma(x_1)) \ne \rho(\sigma(x_2))$, or $\rho\sigma(x_1) \ne \rho\sigma(x_2)$. Thus, the composition $\rho\sigma$ also satisfies the
first required property.

Similarly, if $y$ is any element of $S$, then by definition there exists an element $z$ of $S$
such that $\rho(z) = y$, and, by the same argument, there exists an element $x$ of $S$ such that
$\sigma(x) = z$. Combining these two we see that

$$\rho\sigma(x) = \rho(\sigma(x)) = \rho(z) = y$$

so that the composition $\rho\sigma$ also satisfies the second required property. Thus $\rho\sigma$ is a permutation
of $S$. ∎


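If $\sigma$ is any permutation, then we define $\sigma^0 = \mathrm{Id}$, $\sigma^1 = \sigma$, $\sigma^2 = \sigma\sigma$, and, if $k$ is any
positive integer, $\sigma^k$ is the composition of $k$ $\sigma$'s. Accordingly, if

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 4 & 5 & 1 \end{pmatrix}, \quad \text{then} \quad \sigma^2 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 4 & 5 & 1 & 2 \end{pmatrix} \quad \text{and} \quad \sigma^3 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 5 & 1 & 2 & 3 \end{pmatrix}.$$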


We now describe yet another, more efficient, way of writing down permutations of
finite sets. We first define a cycle, or a cyclic permutation, as a permutation of the form

$$\begin{pmatrix} a & b & c & \ldots & g & h \\ b & c & d & \ldots & h & a \end{pmatrix}$$

and agree to write it in any of the forms

$$(a\ b\ c \ldots g\ h) = (b\ c \ldots g\ h\ a) = (c\ d \ldots h\ a\ b) = \cdots = (h\ a\ b \ldots g).$$

If $k$ is any positive integer, then a $k$-cycle is a cycle that contains $k$ elements. Thus,
$(3\ 5\ 7)$ is a 3-cycle and $(1\ 8\ 7\ 2\ 5)$ is a 5-cycle. Suppose next that $\sigma$ is an arbitrary
permutation and $a$ is an arbitrary element of the underlying set $S$. Consider the sequence

$$\sigma^0(a) = a,\ \sigma^1(a) = \sigma(a),\ \sigma^2(a),\ \sigma^3(a),\ \ldots.$$

Since $S$ is finite, this infinite sequence must contain repetitions. Let $k$ be the first
exponent for which there exists another exponent $m > k$ such that $\sigma^k(a) = \sigma^m(a)$, and let $m$ be the least such exponent. The
exponent $k$ must in fact be 0, since otherwise we would have

$$\sigma(\sigma^{k-1}(a)) = \sigma^k(a) = \sigma^m(a) = \sigma(\sigma^{m-1}(a))$$

and by the first property of permutations we would have $\sigma^{k-1}(a) = \sigma^{m-1}(a)$, contradicting
the minimality of $k$. Hence the elements of

$$a,\ \sigma(a),\ \sigma^2(a),\ \sigma^3(a),\ \ldots,\ \sigma^{m-1}(a) \qquad (8.3)$$

are all distinct and $\sigma(\sigma^{m-1}(a)) = a$. If we define $\sigma_a$ to be the cycle

$$(a\ \ \sigma(a)\ \ \sigma^2(a)\ \ \sigma^3(a) \ldots \sigma^{m-1}(a)),$$

then it is clear that $\sigma$ and $\sigma_a$ agree on all the elements of List 8.3. If List 8.3 does not
exhaust all the elements permuted by $\sigma$, let $b$ be an element that does not appear in
List 8.3, and let $h$ be the least positive integer such that $\sigma^h(b) = b$. We then define

$$\sigma_b = (b\ \ \sigma(b)\ \ \sigma^2(b)\ \ \sigma^3(b) \ldots \sigma^{h-1}(b)).$$

If this process is repeated until all the elements permuted by $\sigma$ are exhausted, we have
cyclic permutations $\sigma_a, \sigma_b, \ldots$ such that

$$\sigma = \sigma_a\sigma_b \cdots \qquad (8.4)$$

Thus, for example,

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 9 & 5 & 1 & 3 & 7 & 6 & 2 & 4 & 8 \end{pmatrix} = (1\ 9\ 8\ 4\ 3)(6)(2\ 5\ 7).$$
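
The procedure that produces Equation 8.4 (follow $a, \sigma(a), \sigma^2(a), \ldots$ until the starting element recurs, then restart from an element not yet visited) can be sketched as follows. The dictionary representation and the name `disjoint_cycles` are assumptions of this sketch, and the example is the permutation whose cycle form $(1\ 9\ 8\ 4\ 3)(6)(2\ 5\ 7)$ is used in this section.

```python
def disjoint_cycles(sigma):
    """Disjoint cycle decomposition of a permutation given as a dict a -> sigma(a)."""
    seen, cycles = set(), []
    for start in sorted(sigma):
        if start in seen:
            continue
        cycle, a = [], start
        # Follow start, sigma(start), sigma^2(start), ... until the cycle closes.
        while a not in seen:
            seen.add(a)
            cycle.append(a)
            a = sigma[a]
        cycles.append(tuple(cycle))
    return cycles

second_row = [9, 5, 1, 3, 7, 6, 2, 4, 8]
sigma = {i + 1: image for i, image in enumerate(second_row)}
print(disjoint_cycles(sigma))  # [(1, 9, 8, 4, 3), (2, 5, 7), (6,)]
```

The cycles come out in order of their smallest starting element, which is one of the many equivalent ways of writing the same decomposition.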


We shall refer to Equation 8.4 as the disjoint cycle decomposition of $\sigma$. Note that the order
in which the individual cycles in the disjoint cycle form of a permutation are written is
arbitrary. Similarly, it is only the cyclic order of the elements that appear in a cycle that is
significant. Thus,

$$(1\ 2\ 3\ 4)(5\ 6\ 7)(8\ 9) = (5\ 6\ 7)(1\ 2\ 3\ 4)(8\ 9) = (8\ 9)(5\ 6\ 7)(1\ 2\ 3\ 4) = (3\ 4\ 1\ 2)(8\ 9)(6\ 7\ 5).$$


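If this process gives rise to a cycle of length 1, that cycle is generally omitted. Accordingly,

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 9 & 5 & 1 & 3 & 7 & 6 & 2 & 4 & 8 \end{pmatrix} = (1\ 9\ 8\ 4\ 3)(2\ 5\ 7).$$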


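It is clear that if $\sigma = (a_1\ a_2\ a_3 \ldots a_{k-1}\ a_k)$ and $\rho = (a_k\ a_{k-1} \ldots a_3\ a_2\ a_1)$, then $\rho\sigma = \sigma\rho = \mathrm{Id}$.
We say that $\rho$ and $\sigma$ are inverses of each other and write $\sigma = \rho^{-1}$ and $\rho = \sigma^{-1}$. Even
when $\sigma$ is not necessarily cyclic, it has an inverse and this inverse is easily described. To
see this, let $\sigma = \sigma_1\sigma_2 \cdots \sigma_n$ be the disjoint cycle decomposition of $\sigma$. If we now set

$$\rho = \sigma_n^{-1}\sigma_{n-1}^{-1} \cdots \sigma_2^{-1}\sigma_1^{-1},$$

then

$$\sigma\rho = \sigma_1\sigma_2 \cdots \sigma_{n-1}\sigma_n\sigma_n^{-1}\sigma_{n-1}^{-1} \cdots \sigma_2^{-1}\sigma_1^{-1} = \sigma_1\sigma_2 \cdots \sigma_{n-1}\sigma_{n-1}^{-1} \cdots \sigma_2^{-1}\sigma_1^{-1} = \cdots = \mathrm{Id}.$$

Similarly, $\rho\sigma = \mathrm{Id}$. Thus, the inverse of the permutation $(1\ 9\ 8\ 4\ 3)(6)(2\ 5\ 7)$ is the
permutation $(7\ 5\ 2)(6)(3\ 4\ 8\ 9\ 1)$.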
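
The closing example can be verified directly. In this sketch a permutation is again a dictionary of images; the helpers `from_cycles` (which assumes the cycles it is given are disjoint) and `inverse` are hypothetical names introduced here.

```python
def from_cycles(cycles, n):
    """Build the permutation of {1, ..., n} with the given disjoint cycles."""
    sigma = {a: a for a in range(1, n + 1)}
    for cycle in cycles:
        for i, a in enumerate(cycle):
            sigma[a] = cycle[(i + 1) % len(cycle)]
    return sigma

def inverse(sigma):
    """Reverse every arrow a -> sigma(a)."""
    return {image: a for a, image in sigma.items()}

sigma = from_cycles([(1, 9, 8, 4, 3), (6,), (2, 5, 7)], 9)
rho = from_cycles([(7, 5, 2), (6,), (3, 4, 8, 9, 1)], 9)

print(inverse(sigma) == rho)                          # True
print(all(rho[sigma[a]] == a for a in range(1, 10)))  # True
```

Reversing each cycle is exactly what reversing every arrow amounts to, which is why the two descriptions of the inverse agree.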




It is clear from this description of inverses that every permutation has a unique inverse.
If we set $\sigma^{-n} = (\sigma^{-1})^n$ for every nonnegative integer $n$, then the powers of permutations
obey the usual exponential rules (Exercise 8.2.28). The following lemma, however, is not
quite so obvious.


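Lemma 8.5 If $\rho$ and $\sigma$ are permutations of the same set, then $(\rho\sigma)^{-1} = \sigma^{-1}\rho^{-1}$.

Proof. The proof is a straightforward application of the associativity of the composition
of permutations. Note that

$$(\rho\sigma)(\sigma^{-1}\rho^{-1}) = \rho(\sigma\sigma^{-1})\rho^{-1} = \rho\,\mathrm{Id}\,\rho^{-1} = \rho\rho^{-1} = \mathrm{Id}$$

and

$$(\sigma^{-1}\rho^{-1})(\rho\sigma) = \sigma^{-1}(\rho^{-1}\rho)\sigma = \sigma^{-1}\,\mathrm{Id}\,\sigma = \sigma^{-1}\sigma = \mathrm{Id}.$$

Thus, $\sigma^{-1}\rho^{-1}$ fulfills the requisite conditions for being the inverse of $\rho\sigma$. ∎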


The analysis of the effects of permutations is often facilitated by factoring them into the
composition of "smaller" permutations. A transposition is a permutation that interchanges
only two elements, leaving all the others fixed. Thus every transposition has the form
$(a\ b)$. It is clear that in an informal sense the transpositions are the smallest nontrivial
permutations. The equations $(1\ 2\ 3) = (1\ 2)(2\ 3)$ and $(1\ 2\ 3\ 4) = (1\ 2)(2\ 3)(3\ 4)$
are instances of nontranspositions expressed as the composition of transpositions. This is
always possible.


Proposition 8.6 Every permutation of a finite set is the composition of some transpositions.

Proof. We already know that every permutation of a finite set is the composition of
cyclic permutations, and hence it suffices to show that every cyclic permutation can be
expressed as the composition of some transpositions. This, however, is easily accomplished
as follows: $(a_1\ a_2\ a_3 \ldots a_k) = (a_1\ a_2)(a_2\ a_3) \cdots (a_{k-1}\ a_k)$. ∎
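
The factorization used in this proof is easy to test on examples. The sketch below assumes the convention of this section, namely that in a product the rightmost factor acts first; the function names are hypothetical.

```python
def cycle_to_transpositions(cycle):
    """(a1 a2 ... ak) -> [(a1 a2), (a2 a3), ..., (a_{k-1} a_k)]."""
    return [(cycle[i], cycle[i + 1]) for i in range(len(cycle) - 1)]

def apply_product(factors, a):
    """Apply a product of cycles to a, with the rightmost factor acting first."""
    for cycle in reversed(factors):
        if a in cycle:
            a = cycle[(cycle.index(a) + 1) % len(cycle)]
    return a

cycle = (2, 4, 8, 9)
factors = cycle_to_transpositions(cycle)
print(factors)  # [(2, 4), (4, 8), (8, 9)]
# The product of the transpositions moves every element exactly as the cycle does.
print(all(apply_product(factors, a) == apply_product([cycle], a) for a in cycle))  # True
```

A $k$-cycle therefore needs only $k - 1$ transpositions, a count that becomes important once parities are discussed.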


Thus,

$$(1\ 5\ 3)(2\ 4\ 8\ 9)(a\ b\ c\ d\ e) = (1\ 5)(5\ 3)(2\ 4)(4\ 8)(8\ 9)(a\ b)(b\ c)(c\ d)(d\ e).$$

It should be stressed that such expressions are not unique as illustrated by the equations

$$\begin{aligned}
(1\ 2\ 3) &= (1\ 2)(2\ 3) = (1\ 3)(1\ 2) = (2\ 3)(1\ 3) \\
&= (1\ 2)(3\ 4)(2\ 4)(3\ 4) = (2\ 3)(1\ 3)(2\ 3)(3\ 4)(2\ 4)(3\ 4).
\end{aligned}$$


It is clear that if $\sigma = (a_1\ a_2 \ldots a_k)$ is any cyclic permutation then $\sigma^k = \mathrm{Id}$. Consequently,
if $m$ is any common multiple of the lengths of $\sigma_1, \sigma_2, \ldots, \sigma_r$ in the disjoint cycle
factorization of the arbitrary permutation $\sigma = \sigma_1\sigma_2 \cdots \sigma_r$, then $\sigma^m = \mathrm{Id}$. Thus,

$$((1\ 2\ 3)(4\ 5))^6 = \mathrm{Id}.$$

The order $o(\sigma)$ of the permutation $\sigma$ is the least positive integer $m$ such that $\sigma^m = \mathrm{Id}$.
The above considerations make it clear that every permutation of a finite set has a finite
order. Exercise 8.2.31 asserts that the order of any permutation equals the least common
multiple of the lengths of its disjoint cyclic factors.
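
The two descriptions of the order (the least $m$ with $\sigma^m = \mathrm{Id}$, and the least common multiple of the cycle lengths) can be compared numerically. This is a sketch only; the dictionary representation and all function names are assumptions made here, and the comparison is an illustration rather than a proof of Exercise 8.2.31.

```python
from math import gcd

def order_by_powers(sigma):
    """Least m >= 1 with sigma^m = Id, found by repeated composition."""
    power, m = dict(sigma), 1
    while any(power[a] != a for a in power):
        power = {a: sigma[power[a]] for a in power}  # power is now sigma^(m+1)
        m += 1
    return m

def cycle_lengths(sigma):
    """Lengths of the cycles in the disjoint cycle decomposition of sigma."""
    seen, lengths = set(), []
    for start in sigma:
        length, a = 0, start
        while a not in seen:
            seen.add(a)
            length += 1
            a = sigma[a]
        if length:
            lengths.append(length)
    return lengths

def order_by_lcm(sigma):
    """Least common multiple of the cycle lengths of sigma."""
    result = 1
    for k in cycle_lengths(sigma):
        result = result * k // gcd(result, k)
    return result

sigma = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}  # the permutation (1 2 3)(4 5)
print(order_by_powers(sigma), order_by_lcm(sigma))  # 6 6
```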
Exercises 8.2 


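Express the permutations in Exercises 8.2.1 to 8.2.4 in the disjoint cycle form.

1. $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 9 & 3 & 4 & 7 & 1 & 5 & 2 & 6 & 8 \end{pmatrix}$

2. $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 5 & 3 & 4 & 7 & 1 & 6 & 2 & 9 & 8 \end{pmatrix}$

3. $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 9 & 8 & 7 & 6 & 5 & 4 & 3 & 2 & 1 \end{pmatrix}$

4. $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 9 & 7 & 5 & 3 & 1 & 8 & 6 & 4 & 2 \end{pmatrix}$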


5- List the permutations of {1,2} in disjoint cycle form. 

6. List the permutations of { 1, 2,3} in disjoint cycle form. 

7. List the permutations of { 1, 2, 3,4} in disjoint cycle form. 
Given the permutations $\rho = (1\ 2\ 3\ 4)(5\ 6\ 7)(8\ 9)$ and $\sigma = (1\ 9\ 8\ 6\ 5)(2\ 3\ 4\ 7)$, express
the permutations in Exercises 8.2.8 to 8.2.14 in disjoint cycle form and compute their
orders.

8. eo Io. ede 12. poo! 14. oo 


9. oo II. oogo 13. ooo! 




Express the permutations in Exercises 8.2.15 to 8.2.18 as a composition of transpositions. 


15. 


16. 


19. 


20. 


21. 


22. 


23. 


123456789 
(123)(4567)(89) ifs Cone 
123456789 
(1429)(5637) 18. total gaeea) 


19. Prove that if $n > 2$, then every permutation of $\{1, 2, \ldots, n\}$ can be expressed as
the composition of transpositions of the form $(1\ a)$, where $a = 2, 3, \ldots, n$.

20. Show that if $n > 4$, then every permutation of $\{1, 2, \ldots, n\}$ is expressible as the
composition of 4-cycles.

21. Show that if $n > k > 2$ and $k$ is even, then every permutation of $\{1, 2, \ldots, n\}$ is
expressible as the composition of $k$-cycles.

22. Prove that if $\rho$ and $\sigma$ are permutations of $\{1, 2, \ldots, n\}$, then $o(\sigma) = o(\rho\sigma\rho^{-1})$.

23. Prove the following:

(a) If any cycle of the permutation $\sigma$ has the form $(a_1\ a_2\ a_3 \ldots a_k)$, and if $\rho$ is any
other permutation, then $\rho\sigma\rho^{-1}$ has a corresponding cycle of the form
$(\rho(a_1)\ \rho(a_2)\ \rho(a_3) \ldots \rho(a_k))$.

(b) For each positive integer $k$, $\sigma$ and $\rho\sigma\rho^{-1}$ have the same number of $k$-cycles.

(c) If the two permutations $\sigma$ and $\tau$ possess the same number of $k$-cycles for each
positive integer $k$, then there is a permutation $\rho$ such that $\tau = \rho\sigma\rho^{-1}$.


The number of cycles in the disjoint cycle form of the permutation $\sigma$ on $\{1, 2, \ldots, n\}$ is
denoted by $\|\sigma\|$. Thus $\|(1\ 2\ 3)(4\ 5)(6\ 7\ 8\ 9)\| = 3$ if $n = 9$, and $\|\mathrm{Id}\| = n$ regardless
of the value of $n$.


24. Let $\sigma$ be any permutation and let $\tau$ be any transposition on the same set. Prove
that $\|\sigma\tau\| = \|\sigma\| \pm 1$.

25. Prove that if $\sigma$ and $\rho$ are any two permutations, then $\|\rho\sigma\rho^{-1}\| = \|\sigma\|$.

26. Prove that if $\sigma$ and $\rho$ are any two permutations, then $\|\rho\sigma\| = \|\sigma\rho\|$.

27. Prove that every permutation of $\{1, 2, \ldots, n\}$ is expressible as the composition of
two permutations of $\{1, 2, \ldots, n\}$ that have order at most 2.

28. Prove that if $\sigma$ is any permutation and $m$ and $n$ are any integers, then $\sigma^m\sigma^n = \sigma^{m+n}$
and $(\sigma^m)^n = \sigma^{mn}$.


29. Let $S$, $T$, $U$, and $V$ be sets, and let $f$ be a function from $S$ to $T$, $g$ a function
from $T$ to $U$, and $h$ a function from $U$ to $V$. If the composition $g \circ f$ is defined
via $g \circ f(x) = g(f(x))$, prove that $h \circ (g \circ f) = (h \circ g) \circ f$.

30. Find two permutations $\rho$ and $\sigma$ that have relatively prime orders and for which
$o(\rho\sigma) \ne o(\rho)o(\sigma)$.

31. Suppose $\sigma = \sigma_1\sigma_2 \cdots \sigma_k$ is the disjoint cycle form of $\sigma$, and suppose that each
factor $\sigma_i$ contains $m_i$ elements of $S$ for $i = 1, 2, \ldots, k$. Prove that $o(\sigma)$ is the least
common multiple of $m_1, m_2, \ldots, m_k$.

32. Prove that the order of every permutation on $\{1, 2, \ldots, n\}$ is a proper factor of $n!$
if $n > 2$.

33. For any integers $k > 0$ and $n > 0$, let $s(n, k)$ denote the number of permutations
of $\{1, 2, \ldots, n\}$ whose disjoint cycle decomposition has exactly $k$ cycles. Prove
that if $k > 1$ and $n > 1$, then $s(n, k) = s(n-1, k-1) + (n-1)s(n-1, k)$.

34. Show that the average number of cycles in the disjoint cycle decomposition of all
the permutations on $\{1, 2, \ldots, n\}$ is $1 + 1/2 + 1/3 + \cdots + 1/n$.

35. Prove that if the set $S$ is finite and $\sigma$ is a function of $S$ into itself, then the following
conditions are equivalent:

(a) If $x_1$ and $x_2$ are distinct elements of $S$, then $\sigma(x_1) \ne \sigma(x_2)$.

(b) If $y$ is any element of $S$, then there is an element $x$ in $S$ such that $y = \sigma(x)$.

36. Show, by means of examples, that when $S$ is an infinite set, neither of the conditions
of Exercise 8.2.35 need entail the other.


8.3 Permuting the Variables of a Function II


We now return to the issue of the number of distinct variants of a given function of several
variables. We begin by formalizing the notion of interchanging the variables of a function.
If $f = f(x_1, x_2, \ldots, x_n)$ is any function of $n$ variables, and if $\sigma$ is any permutation of the
indices, then we define

$$\sigma f = f(x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(n)}).$$

If $f = x_1x_2 + x_3x_4$ and $\sigma$ is the cyclic permutation $(1\ 2\ 3\ 4)$, then $\sigma f = x_2x_3 + x_4x_1$.
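
The definition of $\sigma f$ translates almost literally into code. This is a minimal sketch: the name `act`, the dictionary representation of $\sigma$, and the test point are assumptions made for illustration.

```python
def act(sigma, f):
    """Return sigma f, where (sigma f)(x_1, ..., x_n) = f(x_sigma(1), ..., x_sigma(n))."""
    n = len(sigma)
    def sigma_f(*x):
        # x[sigma[i] - 1] is the variable x_sigma(i); Python tuples are 0-indexed.
        return f(*(x[sigma[i] - 1] for i in range(1, n + 1)))
    return sigma_f

f = lambda x1, x2, x3, x4: x1 * x2 + x3 * x4
sigma = {1: 2, 2: 3, 3: 4, 4: 1}             # the cycle (1 2 3 4)
expected = lambda x1, x2, x3, x4: x2 * x3 + x4 * x1

point = (2, 3, 5, 7)
print(act(sigma, f)(*point), expected(*point))  # 29 29
```

Agreement at one point does not by itself prove equality of the two polynomials; here it merely illustrates the substitution $x_i \mapsto x_{\sigma(i)}$.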


Lemma 8.7 If $f$ and $g$ are two functions of the variables $x_1, x_2, \ldots, x_n$ such that $f = g$,
and if $\sigma$ is any permutation of $\{1, 2, \ldots, n\}$, then $\sigma f = \sigma g$.

The next two observations are easy, but fundamental. Their justification relies on the
fact that in the composition $\rho\sigma$ it is $\sigma$ that acts first on the permuted elements, and its
action is followed by that of $\rho$.

Lemma 8.8 If $f$ is any function of the variables $x_1, x_2, \ldots, x_n$, and if $\rho$ and $\sigma$ are any
two permutations of $\{1, 2, \ldots, n\}$, then $(\rho\sigma)f = \rho(\sigma f)$.

Corollary 8.9 If $f = \sigma f$, then $f = \sigma^{-1}f$.


Proof. If $f = \sigma f$, then, by Lemmas 8.7 and 8.8,

$$\sigma^{-1}f = \sigma^{-1}(\sigma f) = (\sigma^{-1}\sigma)f = (\mathrm{Id})f = f.$$ ∎

If it so happens that $\sigma f = f$, we say that $\sigma$ leaves $f$ unchanged. Thus, $(1\ 3)(2\ 4)$ leaves
$x_1x_2 + x_3x_4$ unchanged. We are now ready to state and prove this section's main theorem.


Theorem 8.10 (Cauchy) Let $f$ be a function of $n$ variables, and let $p$ be any prime
such that $p \le n$. If $f$ has $k < p$ distinct variants, then $k$ is either 1 or 2.

Proof. Let $f$ and $k < p \le n$ be as in the statement of the theorem. We first show that if $\sigma$
is any permutation of order $p$ of $\{1, 2, \ldots, n\}$, then $\sigma f = f$. Consider the $p$ functions

$$f,\ \sigma f,\ \sigma^2 f,\ \ldots,\ \sigma^{p-1}f.$$

Since $f$ has only $k < p$ different variants, it follows that there exist two distinct integers
$r$ and $s$ with $0 \le r < s < p$ such that

$$\sigma^r f = \sigma^s f \quad \text{or} \quad \sigma^{s-r}f = f.$$

It is therefore clear that $\sigma^{a(s-r)}f = f$ for every integer $a$. Since $0 < s - r < p$, $s - r$ is
relatively prime to $p$ so that there exist integers $A$ and $B$ such that $A(s-r) + Bp = 1$.
But then, bearing in mind that $\sigma^p = \mathrm{Id}$,

$$f = \sigma^{A(s-r)}f = \sigma^{1-Bp}f = \sigma(\sigma^{-Bp}f) = \sigma(\mathrm{Id}^{-B}f) = \sigma f.$$

Thus we have proved that if $\sigma$ is any permutation of order $p$, then $\sigma f = f$.

Next we show that the application of any two transpositions to the variables of $f$ also
leaves $f$ unchanged. Let $\alpha$ and $\beta$ be the two specific permutations

$$\alpha = (1\ 2\ 3\ 4 \ldots p) \quad \text{and} \quad \beta = (p\ \ p-1 \ldots 4\ 2\ 3\ 1).$$

Since they both have order $p$, both leave $f$ unchanged. Consequently, their composition

$$\beta\alpha = (1\ 3\ 2) = (3\ 2)(2\ 1)$$

also leaves $f$ unchanged. By a similar argument, the permutation

$$(2\ 4\ 3) = (4\ 3)(3\ 2)$$

also leaves $f$ unchanged, as must their composition

$$(4\ 3)(3\ 2)(3\ 2)(2\ 1) = (4\ 3)(2\ 1).$$

Since there was nothing special about the choice of the variables $x_1$, $x_2$, $x_3$, and $x_4$, we
now know that $f$ is unchanged whenever its variables are permuted by two, or any even
number of consecutive transpositions.

Finally, we demonstrate that if $\rho$ and $\sigma$ are any permutations expressible as the composition
of an odd number of transpositions then $\rho f = \sigma f$. Since it is already known that
every permutation is expressible as the composition of some number of transpositions,
this will conclude the proof of the theorem. Note that it suffices to show that for any such
$\sigma$, $(1\ 2)f = \sigma f$. However, since $\sigma$ is the product of an odd number of transpositions,
it follows that $(1\ 2)\sigma$ is the product of an even number of transpositions, so that by the
previous argument

$$f = (1\ 2)\sigma f$$

and so

$$(1\ 2)f = (1\ 2)(1\ 2)\sigma f = \sigma f.$$

Thus, in general, every variant $\sigma f$ of $f$ equals either $f$ or $(1\ 2)f$, depending on whether
$\sigma$ is expressible as the composition of an even or an odd number of transpositions. Of
course, it could happen that $f$ and $(1\ 2)f$ might be equal, in which case $f$ is invariant.
In other words, $k$ is either 2 or 1. ∎
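
The two permutation identities on which the middle of the proof rests, $\beta\alpha = (1\ 3\ 2)$ and $(4\ 3)(3\ 2)(3\ 2)(2\ 1) = (4\ 3)(2\ 1)$, can be confirmed for a few primes. This sketch assumes the dictionary representation of permutations; `cycle_map` and `compose` are hypothetical helper names, and the primes tested are an arbitrary selection.

```python
def cycle_map(cycle, n):
    """The permutation of {1, ..., n} given by a single cycle."""
    sigma = {a: a for a in range(1, n + 1)}
    for i, a in enumerate(cycle):
        sigma[a] = cycle[(i + 1) % len(cycle)]
    return sigma

def compose(rho, sigma):
    """The composition rho sigma, in which sigma acts first."""
    return {a: rho[sigma[a]] for a in sigma}

for p in (5, 7, 11):
    alpha = cycle_map(list(range(1, p + 1)), p)             # (1 2 3 4 ... p)
    beta = cycle_map(list(range(p, 3, -1)) + [2, 3, 1], p)  # (p p-1 ... 4 2 3 1)
    print(p, compose(beta, alpha) == cycle_map([1, 3, 2], p))  # True for each p

t = lambda a, b: cycle_map([a, b], 4)
lhs = compose(compose(t(4, 3), t(3, 2)), compose(t(3, 2), t(2, 1)))
print(lhs == compose(t(4, 3), t(2, 1)))  # True
```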


While the above theorem was first stated and proved by Cauchy in 1815, the proof we
gave is based on that which appears in Abel's 1826 memoir. A translation of Abel's proof
is presented in Appendix C.

Corollary 8.11 (Ruffini) There exists no function of five variables that has either three
or four distinct variants.


Exercises 8.3

For which of the values of $n$ and $k$ in Exercises 8.3.1 to 8.3.26 does there exist a function
of $n$ variables that has $k$ distinct variants? Justify your answers.

1. $n = 1$, $k = 1$
2. $n = 2$, $k = 1$
3. $n = 2$, $k = 2$
4. $n = 3$, $k = 1$
5. $n = 3$, $k = 2$
6. $n = 3$, $k = 3$
7. $n = 4$, $k = 1$
8. $n = 4$, $k = 2$
9. $n = 4$, $k = 3$
10. $n = 4$, $k = 4$
11. $n = 5$, $k = 1$
12. $n = 5$, $k = 2$
13. $n = 5$, $k = 3$
14. $n = 5$, $k = 4$
15. $n = 5$, $k = 5$
16. $n = 5$, $k = 10$
17. $n = 6$, $k = 1$
18. $n = 6$, $k = 3$
19. $n = 6$, $k = 4$
20. $n = 6$, $k = 6$
21. $n = 7$, $k = 6$

$n$ arbitrary, $k = n$

$n > r > 0$, $k = \binom{n}{r}$


8.4 The Parity of a Permutation


It was seen above that every permutation can be expressed as the composition of transpositions.
Moreover, in the last paragraphs of the proof of (Cauchy's) Theorem 8.10, the
issue of the parity of the number of factors in this expression became significant. It was
noted earlier that any permutation can be factored into transpositions in many ways, as
illustrated by the equations

$$\begin{aligned}
(1\ 2\ 3) &= (1\ 3)(1\ 2) \\
&= (1\ 3)(1\ 4)(2\ 4)(1\ 4) \\
&= (1\ 3)(1\ 2)(2\ 4)(1\ 2)(2\ 4)(1\ 4).
\end{aligned}$$

However, as indicated by this example, the parity of the number of transpositions in any
such factorization of a given permutation is fixed, a fact that we now set out to prove.

For every integer $n \ge 2$ we define the discriminant $\Delta_n$ as the polynomial

$$\Delta_n = (x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_n)(x_2 - x_3) \cdots (x_2 - x_n) \cdots (x_{n-1} - x_n).$$

Accordingly,

$$\begin{aligned}
\Delta_2 &= x_1 - x_2, \\
\Delta_3 &= (x_1 - x_2)(x_1 - x_3)(x_2 - x_3), \\
\Delta_4 &= (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4).
\end{aligned}$$

It will now be demonstrated, as promised in Section 8.1, that for each integer $n \ge 2$ the
discriminant $\Delta_n$ has two distinct variants. This fact, in turn, will be used to draw some
interesting conclusions regarding the parities of permutations. The brunt of the work is
contained in the proof of the following observation.
Lemma 8.12 If $\tau$ is any transposition, then $\tau\Delta_n = -\Delta_n$.

Proof. Suppose first that $\tau = (i\ \ i+1)$ with $1 \le i < n$. The effect of $\tau$ on $\Delta_n$ is to replace
the segment

$$(x_i - x_{i+1})(x_i - x_{i+2}) \cdots (x_i - x_n)(x_{i+1} - x_{i+2}) \cdots (x_{i+1} - x_n)$$

with

$$(x_{i+1} - x_i)(x_{i+1} - x_{i+2}) \cdots (x_{i+1} - x_n)(x_i - x_{i+2}) \cdots (x_i - x_n).$$

Since $(x_{i+1} - x_i) = -(x_i - x_{i+1})$, it follows that for $\tau = (i\ \ i+1)$ we do indeed have
$\tau\Delta_n = -\Delta_n$.

Next let $\tau$ be an arbitrary transposition $(a\ b)$ with $a < b$. Since

$$\begin{aligned}
(a\ b) &= (a\ \ a+1\ \ a+2 \ldots b-1\ \ b)(b-1\ \ b-2 \ldots a+1\ \ a) \\
&= (a\ \ a+1)(a+1\ \ a+2) \cdots (b-2\ \ b-1)(b-1\ \ b)(b-1\ \ b-2)(b-2\ \ b-3) \\
&\qquad \cdots (a+2\ \ a+1)(a+1\ \ a) \qquad (8.13)
\end{aligned}$$

and since the right-hand side of Equation 8.13 consists of an odd number (specifically,
$2(b-a) - 1$) of transpositions of the form $(i\ \ i+1)$, it follows from the first part of the
proof that in this case too $\tau\Delta_n = -\Delta_n$. ∎
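
Equation 8.13 can be tested for specific values of $a$ and $b$. The sketch below builds the list of adjacent transpositions, applies them with the rightmost factor acting first (the convention of this chapter), and checks both that the product is $(a\ b)$ and that the number of factors is $2(b-a) - 1$. The function names are hypothetical.

```python
def adjacent_factorization(a, b):
    """The adjacent transpositions on the right-hand side of Equation 8.13, left to right."""
    rising = [(i, i + 1) for i in range(a, b)]               # (a a+1) ... (b-1 b)
    falling = [(i + 1, i) for i in range(b - 2, a - 1, -1)]  # (b-1 b-2) ... (a+1 a)
    return rising + falling

def apply_product(transpositions, x):
    """Apply a product of transpositions to x, rightmost factor first."""
    for u, v in reversed(transpositions):
        if x == u:
            x = v
        elif x == v:
            x = u
    return x

a, b, n = 2, 6, 8
factors = adjacent_factorization(a, b)
swap = lambda x: b if x == a else a if x == b else x
print(len(factors) == 2 * (b - a) - 1)                                   # True
print(all(apply_product(factors, x) == swap(x) for x in range(1, n + 1)))  # True
```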


A permutation is said to be an even permutation or an odd permutation if it is expressible
as the composition of an even or odd number of transpositions, respectively. Thus, every
transposition is necessarily odd whereas every 3-cycle of the form $(a\ b\ c)$ is even since
$(a\ b\ c) = (a\ b)(b\ c)$. Similarly,

$$(1\ 2\ 3)(4\ 5\ 6\ 7) = (1\ 2)(2\ 3)(4\ 5)(5\ 6)(6\ 7)$$

is an odd permutation whereas

$$(1\ 2\ 3\ 4\ 5)(6\ 7)(8\ 9\ a\ b) = (1\ 2)(2\ 3)(3\ 4)(4\ 5)(6\ 7)(8\ 9)(9\ a)(a\ b)$$

is an even permutation. It follows from Proposition 8.6 that every permutation is necessarily
either even or odd, or both. However, it follows from Lemma 8.12 that

$$\sigma\Delta_n = \begin{cases} \Delta_n & \text{if } \sigma \text{ is even,} \\ -\Delta_n & \text{if } \sigma \text{ is odd.} \end{cases}$$

Since $\Delta_n \ne -\Delta_n$ (Exercise 8.4.27), we conclude that no permutation $\sigma$ can be both odd
and even. This is an important fact that deserves being stated as a proposition.

Proposition 8.14 A permutation $\sigma$ of $\{1, 2, \ldots, n\}$, $n \ge 2$, is even or odd according as
$\sigma\Delta_n = \Delta_n$ or $\sigma\Delta_n = -\Delta_n$, respectively. Consequently, no permutation can be both even and odd,
and $\Delta_n$ has only two distinct variants.


An alternate proof of this proposition is indicated in Exercise 8.4.17. The parity of a
permutation is its evenness or oddness.

Since it was already seen in the proof of Proposition 8.6 that every $k$-cycle can be
expressed as the composition of $k - 1$ transpositions, the parity of a permutation can be
easily computed from its disjoint cycle decomposition. Specifically, if $\sigma_1\sigma_2 \cdots \sigma_m$ is the
disjoint cycle decomposition of $\sigma$, where each $\sigma_i$ is a $k_i$-cycle, then $\sigma$ can be expressed as
the composition of $\sum_{i=1}^{m}(k_i - 1)$ transpositions. Thus, the parity of $\sigma$ is identical with
the parity of the integer $\sum_{i=1}^{m}(k_i - 1)$. In particular, the parity of $(1\ 2\ 3\ 4)(5\ 6\ 7)$ is
the same as the parity of $(4 - 1) + (3 - 1) = 5$ which is odd.
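
Both routes to the parity, the sign of $\sigma\Delta_n$ relative to $\Delta_n$ and the count $\sum (k_i - 1)$ from the cycle decomposition, can be compared by machine. This is a sketch under stated assumptions: a permutation is stored as a tuple whose $i$th entry is $\sigma(i)$, and $\sigma\Delta_n / \Delta_n$ is evaluated at the particular point $x_k = k$, where each factor $(x_{\sigma(i)} - x_{\sigma(j)})$ with $i < j$ changes sign exactly when $\sigma(i) > \sigma(j)$. The function names are choices made here.

```python
from itertools import permutations

def sign_via_discriminant(sigma):
    """Value of (sigma Delta_n) / Delta_n at x_k = k; +1 for even, -1 for odd."""
    n, sign = len(sigma), 1
    for i in range(n):
        for j in range(i + 1, n):
            if sigma[i] > sigma[j]:   # this factor has the opposite sign of (x_i - x_j)
                sign = -sign
    return sign

def sign_via_cycles(sigma):
    """(-1) raised to the sum of (k_i - 1) over the disjoint cycles of sigma."""
    n = len(sigma)
    seen, transpositions = [False] * n, 0
    for start in range(n):
        length, i = 0, start
        while not seen[i]:
            seen[i] = True
            length += 1
            i = sigma[i] - 1
        if length:
            transpositions += length - 1
    return 1 if transpositions % 2 == 0 else -1

sigma = (2, 3, 4, 1, 6, 7, 5)   # (1 2 3 4)(5 6 7)
print(sign_via_discriminant(sigma), sign_via_cycles(sigma))  # -1 -1
print(all(sign_via_discriminant(s) == sign_via_cycles(s)
          for s in permutations(range(1, 6))))  # True
```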

This notion of parity can be used to give another convincing example of the utility
of permutations in proving negative results. The well-known 15-puzzle consists of 15
square pieces, numbered 1 through 15, that are placed inside a larger square frame as
indicated in Figure 8.1. A legitimate move consists of the sliding of a neighboring piece
into the empty space. Figure 8.1 describes the effect of several successive legitimate moves.
One now faces the challenge of rearranging the pieces into any prescribed configuration
by means of legitimate moves alone. It turns out that some configurations, such as the
configuration $R$ (for reverse) of Figure 8.2, are in fact unattainable by means of legitimate
moves. Parities of permutations can be used to give a rigorous proof of this fact, and we
shall do so here.

Figure 8.1 Legitimate moves for the 15-puzzle

Let $J$ be the initial configuration of the 15-puzzle as described in Figure 8.1. To
every prescribed configuration $X$ of the 15-puzzle we assign a permutation $P_X$ of the set
$\{1, 2, 3, \ldots, 15, b\}$ by

(a) thinking of the empty space as just another piece labeled $b$ (for blank); and

(b) setting $P_X(i)$ to be the label of the piece that occupies in $X$ the same location that
$i$ has in $J$.


For the configurations of Figure 8.1, 


Note that the sliding of some square labeled $x$, say, into the empty space, is tantamount
to the transposition $(x\ b)$ and hence we have the following lemma.

Lemma 8.15 If the configuration $Y$ is obtained from the configuration $X$ by sliding
the piece $x$ into the empty space, then $P_Y = (x\ b)P_X$.

In Figure 8.1, $F$ is obtained from $E$ by sliding 7 into the empty space, and indeed

$$P_F = (7\ 8\ 12)(b\ 15\ 11) = (7\ b)(7\ 8\ 12\ b\ 15\ 11) = (7\ b)P_E.$$


It follows from this lemma that if a configuration X is obtained from the initial 
configuration J by a sequence of m moves, then Py can be expressed as the composition 
of m transpositions. 

We will now show that the configuration R of Figure 8.2 is unattainable by legitimate moves. Observe that

$P_R = (1\ 15)(2\ 14)(3\ 13)(4\ 12)(5\ 11)(6\ 10)(7\ 9)$

is an odd permutation, since it is expressed here as the composition of seven transpositions. On the other hand, since the empty space occupies the same position in the initial configuration J and in R, it follows that a sequence of legitimate moves leading from J to R must consist of an even number of moves. One way to justify this assertion is to note that since such a sequence of moves terminates with b in its original location, the number of vertical moves must have been even, as must have been the number of horizontal moves. An alternate justification is obtained by coloring the underlying framework in the checkerboard pattern of Figure 8.3. Each legitimate move then changes the color showing in the empty space. Since the empty space returned to its original position, this sequence must consist of an even number of moves. Either way, Lemma 8.15 would then express $P_R$ as the composition of an even number of transpositions, whereas Proposition 8.14 guarantees that the odd permutation $P_R$ is not expressible by means of an even number of transpositions. Hence the configuration R is not attainable by legitimate moves.

Figure 8.2 An unattainable configuration

Figure 8.3 A coloring of the 15-puzzle

A configuration X is said to be standard if the empty space occupies the same position in X as it does in J, i.e., if $P_X(b) = b$. It is clear that the above argument proves the unattainability of any standard configuration X for which $P_X$ is an odd permutation.
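The parity claim for $P_R$ can be checked mechanically. The following is a minimal Python sketch, not part of the original text: the helper names `from_disjoint_cycles` and `parity` are illustrative, and the first helper assumes that the cycles it is given are pairwise disjoint, as they are for $P_R$.

```python
def from_disjoint_cycles(cycles, symbols):
    """Build a permutation, stored as a dict, from pairwise disjoint cycles.

    Symbols that appear in no cycle are left fixed.
    """
    perm = {s: s for s in symbols}
    for cycle in cycles:
        for i, s in enumerate(cycle):
            perm[s] = cycle[(i + 1) % len(cycle)]
    return perm


def parity(perm):
    """Return "even" or "odd", using the disjoint cycle decomposition.

    A cycle of length m is a composition of m - 1 transpositions.
    """
    seen = set()
    transpositions = 0
    for start in perm:
        length = 0
        s = start
        while s not in seen:
            seen.add(s)
            s = perm[s]
            length += 1
        if length:
            transpositions += length - 1
    return "even" if transpositions % 2 == 0 else "odd"


# The 16 labels of the 15-puzzle; "b" stands for the blank.
symbols = list(range(1, 16)) + ["b"]
P_R = from_disjoint_cycles(
    [(1, 15), (2, 14), (3, 13), (4, 12), (5, 11), (6, 10), (7, 9)], symbols
)
print(parity(P_R))  # odd
```

Since `P_R["b"] == "b"`, the configuration R is standard, so the odd parity reported here is exactly what the argument above needs.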


Proposition 8.16 A standard configuration X is attainable from the initial configuration J if and only if $P_X$ is an even permutation.


Sketch of proof. If X is a standard configuration with $P_X$ odd, then the argument that was applied to prove the unattainability of R above accomplishes the same goal for X. The proof of the converse is based on the observation (Exercise 8.4.29) that regardless of the values of the $a_i$'s, one of the two configurations of Figure 8.4 is necessarily attainable by legitimate moves. Moreover, the permutations $P_X$ and $P_Y$ associated with these configurations are related by $P_Y = (a_{14}\ a_{15})P_X$. Consequently, if $P_X$ is even, then $P_Y$ is odd and hence Y is unattainable. Since we have argued that either X or Y must be attainable, it follows that X must be attainable.


Figure 8.4 Either X or Y is attainable
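An exhaustive search over the 15-puzzle itself is too large for a short example, but the analogous statement can be tested by brute force on a smaller board. The sketch below is an illustration under stated assumptions, not the book's argument: it uses a 2-by-3 board, stores the blank as the label 6, and takes the configuration 1, 2, 3, 4, 5, blank as the initial one. All function names are hypothetical.

```python
from collections import deque
from itertools import permutations

ROWS, COLS = 2, 3
BLANK = ROWS * COLS                      # the blank b is stored as the label 6
HOME = tuple(range(1, ROWS * COLS + 1))  # initial configuration, blank last


def moves(state):
    """Yield every configuration obtained from `state` by one legitimate move."""
    pos = state.index(BLANK)
    r, c = divmod(pos, COLS)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < ROWS and 0 <= nc < COLS:
            other = nr * COLS + nc
            nxt = list(state)
            nxt[pos], nxt[other] = nxt[other], nxt[pos]
            yield tuple(nxt)


def attainable(start):
    """Breadth-first search over all configurations reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in moves(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_even(state):
    """Parity of the permutation i -> state[i - 1], counted by inversions."""
    n = len(state)
    inversions = sum(
        1 for i in range(n) for j in range(i + 1, n) if state[i] > state[j]
    )
    return inversions % 2 == 0


reachable = attainable(HOME)
# Standard configurations: the blank sits in its home position.
standard = [p + (BLANK,) for p in permutations(range(1, BLANK))]
agrees = all((x in reachable) == is_even(x) for x in standard)
print(len(reachable), agrees)
```

On this small board the search visits half of the 720 arrangements, and a standard configuration turns out to be attainable exactly when its permutation is even, in line with Proposition 8.16.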


Exercises 8.4 


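Determine the parities of the permutations in Exercises 8.4.1 to 8.4.4.

1. $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 9 & 3 & 4 & 7 & 1 & 5 & 2 & 6 & 8 \end{pmatrix}$

2. $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 5 & 3 & 4 & 7 & 1 & 6 & 2 & 9 & 8 \end{pmatrix}$

3. $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 9 & 8 & 7 & 6 & 5 & 4 & 3 & 2 & 1 \end{pmatrix}$

4. $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 9 & 7 & 5 & 3 & 1 & 8 & 6 & 4 & 2 \end{pmatrix}$


  ∘   | even   odd
 -----+-------------
 even | even   odd
 odd  | odd    even

Table 8.1 Parities of permutations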


5. List all the even permutations of {1, 2} in disjoint cycle form.

6. List all the even permutations of {1, 2, 3} in disjoint cycle form.

7. List all the even permutations of {1, 2, 3, 4} in disjoint cycle form.

8. List all the odd permutations of {1, 2} in disjoint cycle form.

9. List all the odd permutations of {1, 2, 3} in disjoint cycle form.

10. List all the odd permutations of {1, 2, 3, 4} in disjoint cycle form.

11. Prove that the effect of composition on the parities of permutations is described by Table 8.1.

12. Prove that if ρ and σ are any two permutations, then
(a) σρ and ρσ have the same parities;
(b) σ and $\rho\sigma\rho^{-1}$ have the same parities.

13. Prove that every even permutation is expressible as the composition of 3-cycles and that this is not true for odd permutations.

14. Let k > 1 be an odd positive integer. Prove that every even permutation of {1, 2, ..., n}, n ≥ k, is expressible as the composition of k-cycles and that this is not true for odd permutations.

15. Prove that for n > 3 every even permutation of {1, 2, ..., n} is expressible as a composition of 3-cycles of the form (1 a b) and (1 2 a).

16. Determine for which n there exist cyclic permutations σ and τ of {1, 2, ..., n} such that στ = (1 2 3 ... n). Prove your answer.

17. Use Exercise 8.2.24 to prove Proposition 8.14.

18. Which of the configurations of the 15-puzzle in Figure 8.5 are attainable by legitimate moves?

19. Formulate and prove a necessary and sufficient condition for an arbitrary (not necessarily standard) configuration to be attainable by legitimate moves.

20. Which of the configurations of Figure 8.6 is attainable?

21. How many of the standard configurations of the 15-puzzle are attainable and how many are not? Prove your answer.

22. Prove that for n > 1, exactly half of the permutations of {1, 2, ..., n} are even.


For the values of n and k specified in Exercises 8.4.23 to 8.4.26, does there exist a function of n variables that has k variants? Justify your answers.


23. 
24. 


27. 


28. 


n>3,k=2n 25. n>3,k=2(3) 
n>2,k=2(3) 26. n>r+1>2,k=2(7) 
27. Let σ be a permutation of {1, 2, ..., n}. Prove that the parity of the permutation σ is the same as the parity of the integer $n - \|\sigma\|$.

28. Prove that for n > 2, $\Delta_n \neq -\Delta_n$.


Figure 8.5 Some configurations of the 15-puzzle


Figure 8.6 More configurations of the 15-puzzle


29. Complete the proof of Proposition 8.16 by showing that regardless of the values of the $a_i$'s, one of the two configurations of Figure 8.4 is necessarily attainable by legitimate moves.


Chapter Summary 


Motivated by Lagrange’s solution of the general quartic equation, we studied the number 
of distinct variants that a multivariable function can have. It was shown in Theorem 8.10 
that there are some strong and surprising limitations on the number of such variants. 
The proof called for a deep analysis of the structure of permutations. The same proof also led to the formulation of the notion of the parity of a permutation. Finally, these new tools, developed with functions in mind, were applied toward the resolution of the popular 15-puzzle.


Figure 8.7 A configuration of the 15-puzzle

Chapter Review Exercises 


Mark the following true or false. 
1. The number of distinct variants of the function x,(x,—.x,)” is six. 
2. The number of permutations of {a, b, c, d, e} is 120.
3. (1 2 3)(4 7 6 5)(1 7 6 2 3)(4 5) = (1 6 3 2)(5 7).
4. The permutation (7 1 4 3 6 2)(5 9)(8) is expressible as the composition of transpositions.
5. If $f = x_1 + 2x_2 + 3x_3 + 4x_4 + 5x_5$, then (1 2 3 4 5)f = (3 4 5 1 2)f.
6. (1 2 3)(2 3 4)f = (1 2)(3 4)f.
7. There is a function of eight variables which has four variants.
8. There is a function of eight variables which has two variants.
9. The 15-puzzle configuration of Figure 8.7 is attainable by legitimate moves.


New Terms

15-puzzle, 171
cycle, 161
cyclic permutation, 161
discriminant, 169
disjoint cycle decomposition, 162
distinct variants, 156
even permutation, 170
identity permutation, 158
invariant function, 156
odd permutation, 170
parity of a permutation, 171
permutation, 158
transposition, 163
variants, 155




Figure 8.8 Legitimate moves for the cylindrical (2, 2)-puzzle 


Supplementary Exercises 


If k and n are any two positive integers, the (k, n)-puzzle is similar to the 15-puzzle, except that it consists of a rectangular array of squares containing k rows and n columns. Accordingly, the (4, 4)-puzzle is identical with the 15-puzzle analyzed in this chapter.

1. Given any initial configuration of the (4, 5)-puzzle, describe all the configurations attainable from it.

2. Given any initial configuration of the (5, 5)-puzzle, describe all the configurations attainable from it.

3. For any positive integers k and n, given an initial configuration of the (k, n)-puzzle, describe all the configurations attainable from it.


4. The cylindrical (k, n)-puzzle is obtained from the traditional version by adding some moves that allow for the pieces to cross the vertical boundaries of the board. Figure 8.8 illustrates some legitimate moves for the cylindrical version of the puzzle. The reason for the cylindrical appellation is that it is possible to obtain a physical interpretation of the legitimacy of the new moves by bending the puzzle into a cylinder and gluing its vertical edges so that in this new form the pieces always do slide into adjacent spaces. Given any initial configuration of the cylindrical (2, n)-puzzle, decide which configurations are attainable from it.


5. The toroidal (k, n)-puzzle is obtained from the cylindrical puzzle by allowing the pieces to slide into the empty space across the horizontal boundaries as well (Figure 8.9). Given any initial configuration of the toroidal (k, n)-puzzle, decide which configurations are attainable from it. This game can be visualized as taking place on a torus (Figure 8.10).


Figure 8.9 Legitimate moves for the toroidal (2, 2)-puzzle


Figure 8.10 A torus


6. The Möbius (k, n)-puzzle is obtained from the traditional one by allowing for the filling of the blanks across the vertical boundaries, with the additional twist that the blank can be filled only by boundary pieces that are diametrically opposite to it (Figure 8.11). As its name implies, this version can be visualized on the Möbius strip of Figure 8.12. For any initial configuration of the Möbius (k, n)-puzzle, determine the configurations that are attainable from it.


7. The Klein bottle (k, n)-puzzle allows for the cylindrical moves of Exercise 8.s.4 across its horizontal boundaries and the Möbius-type moves of Exercise 8.s.6 across its vertical boundaries. For any initial configuration of the Klein bottle (k, n)-puzzle, determine the configurations that are attainable from it. This game can be visualized (with some difficulty) on the Klein bottle of Figure 8.13.


Figure 8.11 Examples of diametrically opposite boundary pieces


Figure 8.12 The Möbius strip


Figure 8.13 The Klein bottle


8. Generalize the (k, n)-puzzle to other planar versions.

9. Formulate a 3-dimensional version of the 15-puzzle and solve it.

10. Formulate a 3-dimensional version of the toroidal (k, n)-puzzle and solve it.

11. Formulate a 3-dimensional version of the Klein bottle (k, n)-puzzle and solve it.

12. Formulate and solve d-dimensional analogs of all the versions of the (k, n)-puzzle for every positive integer d.

13. Prove Cauchy's Theorem that a function of n > 5 variables which has fewer than n distinct variants must have either one or two distinct variants.

14. Let n and k be two positive integers. Which permutations of {1, 2, ..., n} are expressible as the composition of k cyclic permutations of the same set?

15. Is it true that for any three integers k, l, m > 1 there exist permutations ρ and σ such that ρ, σ, and ρσ have orders k, l, and m, respectively?


Chapter 9 


GROUPS 


We now extract the notion of a group from the variety of algebraic structures that have been discussed in the previous chapters. This leads to a natural classification problem, and information is provided toward the classification of some elementary groups.


9.1 Permutation Groups 


A few years after Abel proved that the general quintic equation was not solvable by radicals, the young Évariste Galois discovered a general criterion for determining whether any given equation was solvable by radicals. The only two specific examples that Galois gave were Abel's aforementioned theorem and Gauss's work on the cyclotomic equation, which was discussed in some detail in Chapters 2 and 3. Galois observed that permutations (he sometimes called them substitutions) played a crucial role in both proofs. As was seen in Chapter 8, these permutations are quite explicit in Abel's work. They are less evident in Gauss's proof, and the best that can be said within the confines of this book is that in the special case discussed in Section 2.4, the crucial permutation is

$\sigma = (\zeta\ \zeta^{3}\ \zeta^{9}\ \zeta^{10}\ \zeta^{13}\ \zeta^{5}\ \zeta^{15}\ \zeta^{11}\ \zeta^{16}\ \zeta^{14}\ \zeta^{8}\ \zeta^{7}\ \zeta^{4}\ \zeta^{12}\ \zeta^{2}\ \zeta^{6})$

where ζ is any primitive 17th root of 1. Of course, σ can also be thought of as the algebraically defined function $\sigma(\alpha) = \alpha^{3}$ defined on the imaginary 17th roots of 1, but we shall regard it as a purely formal permutation. This permutation bears an obvious relation to the sum

$\zeta + \zeta^{3} + \zeta^{9} + \zeta^{10} + \zeta^{13} + \zeta^{5} + \zeta^{15} + \zeta^{11} + \zeta^{16} + \zeta^{14} + \zeta^{8} + \zeta^{7} + \zeta^{4} + \zeta^{12} + \zeta^{2} + \zeta^{6}$

that was used to prove the ruler-and-compass constructibility of the regular 17-sided polygon. The work of Lagrange, Gauss, and Abel eventually led Galois to formulate the notion of a group of permutations, or permutation group. Given a set S, a group of permutations of S is a set G of permutations of S such that

(a) G contains the identity permutation,

(b) if σ is in G, so is $\sigma^{-1}$,

(c) if σ and τ are in G, so is their composition στ.

We have already encountered several groups of permutations. For each positive integer n, the group of all the permutations of the set {1, 2, ..., n} is called the symmetric group and is denoted by $S_n$. That $S_n$ satisfies the three requirements above follows from the fact that it consists of all of the permutations of {1, 2, ..., n}.

The list of permutations Id, (1 2), (3 4), (1 2)(3 4), which consists of all the permutations that leave the polynomial $x_1 + x_2 - x_3 - x_4$ unchanged, is also a group of permutations. To see this it need only be observed that each of these permutations is its own inverse, and that the composition of any two distinct nonidentity elements is equal to the third nonidentity element. This is no coincidence.


Proposition 9.1 If f is any function of $x_1, x_2, \ldots, x_n$, then the set $S_{n,f}$ of all the permutations of these variables that leave f unchanged is a group of permutations.


Proof. This follows from Lemma 8.8 and Corollary 8.9. □
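For a concrete polynomial, the set of permutations that leave it unchanged can be found by trying every permutation of the variables. The sketch below is illustrative only: it assumes a polynomial is stored as a dictionary from exponent tuples to coefficients, uses 0-based indices internally, and introduces the hypothetical helpers `act`, `stabilizer`, `compose`, and `is_permutation_group`. It is applied to $x_1 + x_2 - x_3 - x_4$, the polynomial discussed before the proposition.

```python
from itertools import permutations


def act(perm, poly):
    """Apply a permutation of the variables to a polynomial.

    `poly` maps exponent tuples to coefficients, so x_1 + x_2 - x_3 - x_4 is
    {(1,0,0,0): 1, (0,1,0,0): 1, (0,0,1,0): -1, (0,0,0,1): -1}.
    `perm[i] == j` means that x_{i+1} is replaced by x_{j+1}.
    """
    result = {}
    for exponents, coeff in poly.items():
        moved = [0] * len(exponents)
        for i, e in enumerate(exponents):
            moved[perm[i]] = e
        result[tuple(moved)] = coeff
    return result


def stabilizer(poly, n):
    """All permutations of the n variables that leave `poly` unchanged."""
    return [p for p in permutations(range(n)) if act(p, poly) == poly]


def compose(p, q):
    """The permutation 'first q, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))


def is_permutation_group(perms):
    """Check conditions (a), (b), (c) from the definition of a group of permutations."""
    perms = set(perms)
    n = len(next(iter(perms)))
    identity = tuple(range(n))

    def inverse(p):
        return tuple(sorted(range(n), key=lambda i: p[i]))

    return (
        identity in perms
        and all(inverse(p) in perms for p in perms)
        and all(compose(p, q) in perms for p in perms for q in perms)
    )


f = {(1, 0, 0, 0): 1, (0, 1, 0, 0): 1, (0, 0, 1, 0): -1, (0, 0, 0, 1): -1}
S_f = stabilizer(f, 4)
print(len(S_f), is_permutation_group(S_f))  # 4 True
```

The four permutations found correspond to Id, (1 2), (3 4), and (1 2)(3 4), and the final check confirms on this example what Proposition 9.1 asserts in general.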


This proposition provides us with a host of groups of permutations. Thus, if f = 
x, +x) +x; , then S,,=5,. If f = yxy xe then S3¢ = {Id}. For f =x.) +-;, 
$3, ¢ = {Id,(12) }, and finally, for 


ee an eee 
SH Hy XZ AX My xG + xP HZ, 


we have $S_{3,f}$ = {Id, (1 2 3), (1 3 2)}. The converse of Proposition 9.1 also holds.


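Proposition 9.2 Let G be any group of permutations of {1, 2, ..., n}. Then there is a polynomial function f such that $G = S_{n,f}$.


Proof. It is clear that the polynomial $g = x_1 x_2^2 x_3^3 \cdots x_n^n$ is such that $S_{n,g} = \{\mathrm{Id}\}$. Suppose now that $G = \{\sigma_1, \sigma_2, \ldots, \sigma_k\}$ is a listing of the elements of G. Set

$f = \sigma_1 g + \sigma_2 g + \cdots + \sigma_k g.$

Since G is a group, it follows that for every $\sigma, \sigma_i \in G$, $\sigma\sigma_i$ also belongs to G. Moreover, $\sigma\sigma_i = \sigma\sigma_j$ if and only if $\sigma_i = \sigma_j$ if and only if i = j, so that $\sigma\sigma_1, \sigma\sigma_2, \ldots, \sigma\sigma_k$ also constitutes a listing of the elements of G. Hence, for any $\sigma \in G$,

$\sigma f = \sigma\sigma_1 g + \sigma\sigma_2 g + \cdots + \sigma\sigma_k g = \sigma_1 g + \sigma_2 g + \cdots + \sigma_k g = f$

so that $\sigma \in S_{n,f}$. Thus we have shown that $S_{n,f} \supseteq G$.

Conversely, suppose $\sigma \in S_{n,f}$. Then σ must transform each summand of f into another summand of f. In particular, for some j = 1, 2, ..., k,

$\sigma\sigma_1 g = \sigma(\sigma_1 g) = \sigma_j g$ or $\sigma_j^{-1}\sigma\sigma_1 g = g.$

Since $S_{n,g} = \{\mathrm{Id}\}$, it follows that $\sigma_j^{-1}\sigma\sigma_1 = \mathrm{Id}$ and so $\sigma = \sigma_j\,\mathrm{Id}\,\sigma_1^{-1} = \sigma_j\sigma_1^{-1} \in G$. Thus, $S_{n,f} \subseteq G$. □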


If G = {Id, (1 2 3), (1 3 2)}, then $g = x_1 x_2^2 x_3^3$ and

$f = x_1 x_2^2 x_3^3 + x_2 x_3^2 x_1^3 + x_3 x_1^2 x_2^3.$
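The construction in the proof of Proposition 9.2 can be replayed for this three-element group. The following Python sketch is only an illustration: polynomials are stored as dictionaries from exponent tuples to coefficients, permutations are 0-based tuples, and the names `act`, `stabilizer`, and `symmetrize` are assumptions introduced here.

```python
from itertools import permutations


def act(perm, poly):
    """Replace x_{i+1} by x_{perm[i]+1} in every monomial of `poly`."""
    result = {}
    for exponents, coeff in poly.items():
        moved = [0] * len(exponents)
        for i, e in enumerate(exponents):
            moved[perm[i]] = e
        result[tuple(moved)] = coeff
    return result


def stabilizer(poly, n):
    """All permutations of the n variables that leave `poly` unchanged."""
    return [p for p in permutations(range(n)) if act(p, poly) == poly]


def symmetrize(group, g):
    """Form f = sigma_1 g + sigma_2 g + ... + sigma_k g over the listed group."""
    f = {}
    for sigma in group:
        for exponents, coeff in act(sigma, g).items():
            f[exponents] = f.get(exponents, 0) + coeff
    return f


# G = {Id, (1 2 3), (1 3 2)} written as 0-based image tuples.
G = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
g = {(1, 2, 3): 1}               # g = x_1 * x_2^2 * x_3^3
f = symmetrize(G, g)

print(sorted(f))                  # exponent tuples of the three summands of f
print(sorted(stabilizer(f, 3)) == sorted(G))
```

The three exponent tuples (1, 2, 3), (2, 3, 1), and (3, 1, 2) match the three monomials displayed above, and the stabilizer of f comes out equal to G, as the proposition predicts.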


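The group of permutations G that Galois associated with the cyclotomic equation $x^{17} - 1 = 0$ consists of all the powers of the permutation

$\sigma = (\zeta\ \zeta^{3}\ \zeta^{9}\ \zeta^{10}\ \zeta^{13}\ \zeta^{5}\ \zeta^{15}\ \zeta^{11}\ \zeta^{16}\ \zeta^{14}\ \zeta^{8}\ \zeta^{7}\ \zeta^{4}\ \zeta^{12}\ \zeta^{2}\ \zeta^{6}).$

Put differently, $G = \{\mathrm{Id}, \sigma, \sigma^2, \sigma^3, \ldots, \sigma^{15}\}$, where

$\sigma^2 = (\zeta\ \zeta^{9}\ \zeta^{13}\ \zeta^{15}\ \zeta^{16}\ \zeta^{8}\ \zeta^{4}\ \zeta^{2})(\zeta^{3}\ \zeta^{10}\ \zeta^{5}\ \zeta^{11}\ \zeta^{14}\ \zeta^{7}\ \zeta^{12}\ \zeta^{6})$

and

$\sigma^{16} = \mathrm{Id}.$

This generalizes as follows.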


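Proposition 9.3 Let σ be any permutation of {1, 2, ..., n}, and let d be the order of σ. Then the set $\langle\sigma\rangle = \{\mathrm{Id}, \sigma, \sigma^2, \ldots, \sigma^{d-1}\}$ is a group of permutations.


Proof. The identity permutation Id belongs to $\langle\sigma\rangle$ by definition. Since $\sigma^d = \mathrm{Id}$, it follows that $(\sigma^k)^{-1} = \sigma^{-k} = \sigma^{d-k}$ for k = 0, 1, 2, ..., d − 1 and so every element of $\langle\sigma\rangle$ has its inverse in $\langle\sigma\rangle$. Finally, if $\sigma^k$ and $\sigma^m$ are two arbitrary elements of $\langle\sigma\rangle$, then so is their composition $\sigma^k\sigma^m = \sigma^{k+m}$, because $\sigma^{k+m} = \sigma^{r}$ where r is the remainder of k + m upon division by d. Thus, $\langle\sigma\rangle$ is indeed a group of permutations. □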
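The set of powers in Proposition 9.3 is easy to generate by repeated composition. The sketch below is a minimal illustration with hypothetical helper names; it stores a permutation of {1, ..., n} as the tuple of its images and assumes the cycles passed to `from_disjoint_cycles` are pairwise disjoint.

```python
def from_disjoint_cycles(cycles, n):
    """Permutation of {1, ..., n} as a tuple p with p[i - 1] = image of i."""
    images = list(range(1, n + 1))
    for cycle in cycles:
        for i, s in enumerate(cycle):
            images[s - 1] = cycle[(i + 1) % len(cycle)]
    return tuple(images)


def compose(p, q):
    """The permutation 'first q, then p'."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def cyclic_group(sigma):
    """Return [Id, sigma, sigma^2, ..., sigma^(d-1)], where d is the order of sigma."""
    identity = tuple(range(1, len(sigma) + 1))
    powers = [identity]
    current = sigma
    while current != identity:
        powers.append(current)
        current = compose(sigma, current)
    return powers


sigma = from_disjoint_cycles([(1, 2, 3), (4, 5)], 5)
group = cyclic_group(sigma)
print(len(group))                                       # 6
print(group[3] == from_disjoint_cycles([(4, 5)], 5))    # True: sigma^3 = (4 5)
```

The length of the returned list is the order d of σ, so the same function also computes orders of permutations.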




Figure 9.1 The symmetries of the rectangle 


Accordingly,

⟨(1 2 3)(4 5)⟩ = {Id, (1 2 3)(4 5), (1 3 2), (4 5), (1 2 3), (1 3 2)(4 5)}

and

⟨(1 2 3 4 5 6)⟩ = {Id, (1 2 3 4 5 6), (1 3 5)(2 4 6), (1 4)(2 5)(3 6), (1 5 3)(2 6 4), (1 6 5 4 3 2)}

are groups of permutations. The alternating group $A_n$ consists of the set of all the even permutations of n symbols. Thus,

$A_1$ = {Id}, $A_2$ = {Id}, $A_3$ = {Id, (1 2 3), (1 3 2)},

and

$A_4$ = {Id, (1 2)(3 4), (1 3)(2 4), (1 4)(2 3), (1 2 3), (1 3 2), (1 2 4), (1 4 2), (1 3 4), (1 4 3), (2 3 4), (2 4 3)}.


The formal proof of the fact that A, is indeed a group of permutations is relegated to 
Exercise 9.1.17. 

There is a host of groups of permutations that are defined geometrically in terms of 
symmetries of configurations rather than algebraic symbols. Consider the rectangle of 
Figure 9.1 whose vertices are labeled 1, 2, 3, 4. It has two obvious symmetries, with respect 
to the x- and y-axes. The first of these interchanges the vertices 1 and 4 and also 2 and 3. Thus it can be denoted by the permutation (1 4)(2 3). Similarly, the symmetry with respect to the y-axis induces the permutation (1 2)(3 4) on the vertices. The central
symmetry that the rectangle has with respect to the origin induces the permutation 
(13)(24) on the vertices. Each of these three permutations is its own inverse, and the 
composition of any two distinct ones equals the third. It therefore follows that if we add 


Id as a trivial symmetry, then the set 
K = {Id,(12)(34),(13)(24),(14)(23)} 


is a group of permutations. This group is known as the Klein 4-group, after the mathe- 
matician Felix Klein whose Erlanger Programm of 1872 set the tone for the investigation 
of the relationship between geometry and group theory for generations to come. 

Before we go on to describe some more geometrical groups of permutations, it is 
necessary to firm up the notion of a symmetry. For the purposes of this discussion, the 
geometrical configurations in question are all assumed to be centered at the origin of a 
three-dimensional coordinate system, and a symmetry of the configuration is a rotation 
of the ambient space about an axis through the origin that leaves the position of the 
configuration unchanged. Thus, from our point of view, a symmetry with respect to 
the x-axis results from a 180° rotation of space about the x-axis. This rotation clearly 
transforms the rectangle of Figure 9.1 right back onto itself. The central symmetry of the 
rectangle with respect to the origin comes from a 180° rotation of space about the z-axis 
which is not drawn in the figure. It is clear that by this definition every configuration 
possesses at least one symmetry, namely, the trivial one that is defined by the identity 
transformation of space. Since each symmetry transforms the configuration onto itself, 
each such symmetry must necessarily permute the vertices of the configuration, as was 


the case in the above rectangle. We refer to these permutations as vertex symmetries. 


Proposition 9.4 The vertex symmetries of any geometrical configuration form a group 


of permutations. 


Proof. As was noted above, the identity permutation is a vertex symmetry of any con- 
figuration. Since the inverse of any rotation is also a rotation, it follows that the inverse 
of any vertex symmetry is also a vertex symmetry. Finally, since the composition of any 
two rotations whose axes intersect at the origin is known to be another such rotation it 


follows that the composition of any two vertex symmetries is also a vertex symmetry. 


We emphasize here that if σ is the vertex permutation that describes the rotation R, then the rotation R replaces the vertex v with the vertex σ(v). The square of Figure 9.2 has eight symmetries: the four that it has as a rectangle, two more that result from clockwise and counterclockwise 90° rotations about the z-axis, and two more that result from 180° rotations about the diagonals 13 and 24, respectively. These last four induce the following respective vertex permutations: (1 2 3 4), (1 4 3 2), (2 4), and (1 3). Thus, the vertex symmetries of the square constitute the permutation group

$D_4$ = {Id, (1 2)(3 4), (1 3)(2 4), (1 4)(2 3), (1 2 3 4), (1 4 3 2), (1 3), (2 4)}.


Figure 9.2 The symmetries of the square


Figure 9.3 A regular tetrahedron



In general, the group of vertex symmetries of the regular n-gon is called the dihedral group $D_n$.
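The three defining conditions of a group of permutations can be checked directly for the Klein 4-group K and for the eight vertex symmetries of the square. The following sketch is an illustration only; the helper names are hypothetical, permutations of {1, 2, 3, 4} are stored as tuples of images, and compositions are read as "first the right factor, then the left one".

```python
def from_disjoint_cycles(cycles, n=4):
    """Permutation of {1, ..., n} as a tuple p with p[i - 1] = image of i."""
    images = list(range(1, n + 1))
    for cycle in cycles:
        for i, s in enumerate(cycle):
            images[s - 1] = cycle[(i + 1) % len(cycle)]
    return tuple(images)


def compose(p, q):
    """The permutation 'first q, then p'."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def is_permutation_group(perms):
    """Conditions (a) identity, (b) inverses, (c) closure under composition."""
    perms = set(perms)
    n = len(next(iter(perms)))
    identity = tuple(range(1, n + 1))
    return (
        identity in perms
        and all(inverse(p) in perms for p in perms)
        and all(compose(p, q) in perms for p in perms for q in perms)
    )


c = from_disjoint_cycles
K = {c([]), c([(1, 2), (3, 4)]), c([(1, 3), (2, 4)]), c([(1, 4), (2, 3)])}
D4 = K | {c([(1, 2, 3, 4)]), c([(1, 4, 3, 2)]), c([(1, 3)]), c([(2, 4)])}

print(is_permutation_group(K), is_permutation_group(D4))  # True True
print(is_permutation_group(D4 - {c([(1, 3)])}))            # False
```

Dropping a single element such as (1 3) breaks closure, which is a quick way to see that the eight listed symmetries really belong together.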
We turn next to some interesting vertex symmetry groups defined by solid configurations. The regular tetrahedron of Figure 9.3, with vertices 1, 2, 3, and 4, has four faces each of which is an equilateral triangle. Each altitude (the line joining a vertex to the center of the opposite face) serves as the axis of two nontrivial ±120° rotations, thus contributing a total of eight vertex symmetries:

(1 2 3), (1 3 2), (1 2 4), (1 4 2),
(1 3 4), (1 4 3), (2 3 4), (2 4 3).


Figure 9.4 A triangular prism


In addition, the line joining the midpoints of the two edges 23 and 14 serves as the axis 
of a 180° rotation that defines the vertex symmetry (2 3 )( 1 4) with the analogous lines 
defining the additional vertex symmetries (13 )( 24) and (12)(34). 

It is now clear that the vertex symmetries of the tetrahedron constitute a by now familiar 
group, namely, the alternating group A, that consists of all the even permutations of 
{1,2,3,4}. 
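That the twelve rotations just listed are exactly the even permutations of {1, 2, 3, 4} can be confirmed by enumeration. The sketch below is illustrative; it counts inversions to decide parity, and the helper names are assumptions introduced here.

```python
from itertools import permutations


def from_disjoint_cycles(cycles, n=4):
    """Permutation of {1, ..., n} as a tuple p with p[i - 1] = image of i."""
    images = list(range(1, n + 1))
    for cycle in cycles:
        for i, s in enumerate(cycle):
            images[s - 1] = cycle[(i + 1) % len(cycle)]
    return tuple(images)


def is_even(p):
    """A permutation is even exactly when it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0


c = from_disjoint_cycles
rotations = (
    [c([])]
    # rotations by +120 and -120 degrees about the four altitudes
    + [c([t]) for t in [(1, 2, 3), (1, 3, 2), (1, 2, 4), (1, 4, 2),
                        (1, 3, 4), (1, 4, 3), (2, 3, 4), (2, 4, 3)]]
    # 180 degree rotations about lines joining midpoints of opposite edges
    + [c(pair) for pair in [[(2, 3), (1, 4)], [(1, 3), (2, 4)], [(1, 2), (3, 4)]]]
)

A4 = {p for p in permutations(range(1, 5)) if is_even(p)}
print(len(rotations), set(rotations) == A4)  # 12 True
```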

This is the time to note that we have excluded some symmetries that others might, and 
sometimes do, include. Specifically, it could be argued that the tetrahedron of Figure 9.3 
possesses a symmetry with respect to the plane that contains the edge 14 and bisects the 
edge 23. The decision to restrict our attention only to symmetries that are realizable as 
rotations in three-dimensional space was arbitrary and based on pedagogical grounds. 

Consider the triangular prism of Figure 9.4, whose lateral sides are equilateral triangles. 


Its group of vertex symmetries consists of the elements 


Id, (123)(456), (132)(465), 
(15)(24)(36), (14)(26)(35), (16)(25)(3 4). 




Figure 9.5 A cube 


Exercises 9.1 


Which of the sets of permutations of { 1, 2,3, 4,5} in Exercises 9.1.1 to 9.1.6 form a 


group? Justify your answer. 
1. All the even permutations. 
2. All the odd permutations. 
3. All the transpositions. 
4. All the permutations that leave 3 fixed. 
5. All the permutations that interchange 2 and 4. 


6. All the permutations that map 1 to 5. 


List the elements of the group S,, f for the function f in Exercises 9.1.7 to 9.1.16. 


7+ Xy +X) 42x; 12. (x, +))x3%4%5 

I Ca a a I a 13+ (x1 — xp)(%3 — 4) 

9: xy +X +X, — xX, 14. (x, + 2%) — % — X4)%s 
10. (x, + X)(x3 + x4) 15. (x, +x) — x5 — x4) 
11s bey — x) 16, x, /X)+x3/%, 


17. Prove that if $f = \Delta_n$, then $S_{n,f} = A_n$.
18. Describe the vertex symmetries of the cube of Figure 9.5. 
19. Describe the vertex symmetries of the octahedron of Figure 9.6. 


20. Describe, without listing them, all the vertex symmetries of the dodecahedron of Figure 9.7.

21. Describe, without listing them, all the vertex symmetries of the icosahedron of Figure 9.8.


Figure 9.6 A regular octahedron


Figure 9.7 A regular dodecahedron


Let A = (1 2 3), B = (2 4 3), C = (1 2)(3 4), and D = (1 3)(2 4) be four rotations of the tetrahedron of Figure 9.3. Find the axes and the angle of the rotations in Exercises 9.1.22 to 9.1.27.

22. A ∘ B     24. B ∘ A     26. C ∘ A

23. A ∘ C     25. C ∘ B     27. D ∘ C


Figure 9.8 A regular icosahedron


Let A = (1 2 3 4)(5 6 7 8), B = (1 8 6)(2 4 7), and C = (1 2)(7 8)(4 6)(3 5) be three rotations of the cube of Figure 9.5. Find the axes and the angle of the rotations in Exercises 9.1.28 to 9.1.33.

28. A ∘ B     30. B ∘ A     32. C ∘ A

29. A ∘ C     31. C ∘ B     33. B ∘ C


34. List the elements of the dihedral group D; as permutations. 
35. How many elements does the dihedral group $D_n$ have?
36. How many of the elements of the dihedral group $D_n$ have order 2?

37. Prove that for every positive integer k there is a polygon whose group of vertex symmetries contains k elements.


9.2 Abstract Groups 


Every group of permutations resembles the earlier algebraic structures of this text, such as the real numbers, the complex numbers, $Z_n$, and the Galois fields, in that it involves a binary operation on a set of objects, this operation being, of course, the composition of permutations. Thus, it is possible to associate with every group of permutations a multiplication table that greatly resembles the tables that were associated with $Z_n$ for each positive integer n.

Table 9.1 describes the compositions of the elements of the Klein 4-group K. There, as elsewhere, the product ab is to be found at the intersection of row a and column b.


         | Id        (12)(34)  (13)(24)  (14)(23)
---------+---------------------------------------
Id       | Id        (12)(34)  (13)(24)  (14)(23)
(12)(34) | (12)(34)  Id        (14)(23)  (13)(24)
(13)(24) | (13)(24)  (14)(23)  Id        (12)(34)
(14)(23) | (14)(23)  (13)(24)  (12)(34)  Id

Table 9.1 The multiplication table of the Klein 4-group K


Similarly, the group of vertex symmetries of the square has Table 9.2 associated with it. It was the British mathematician Cayley who eventually extracted the notion of an abstract group from these tables (see Appendix E). Accordingly, an abstract group consists of a set G and a binary operation · on its elements such that the following four properties are satisfied:

(a) for any two elements a and b of G, a · b is also in G,

(b) there is an element $1_G$ of G such that $a \cdot 1_G = 1_G \cdot a = a$ for every element a of G,

(c) for any elements a, b, c ∈ G, (a · b) · c = a · (b · c),

(d) for every element a of G there is an element $a^{-1}$ such that $a \cdot a^{-1} = a^{-1} \cdot a = 1_G$.

Such an abstract group is denoted by either (G, ·), or sometimes by G alone, if the binary operation is understood. The element $1_G$ is called the identity of G, and the element $a^{-1}$ is called the inverse of a.
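For a finite set, the four properties (a) to (d) can be tested exhaustively. The function below is a minimal sketch of such a test, not a construction from the text; the name `is_group` and the choice to pass the operation as a Python function are assumptions made for illustration.

```python
from itertools import product


def is_group(elements, op):
    """Check properties (a)-(d) of an abstract group on a finite set."""
    elements = list(elements)

    # (a) closure: a . b must again lie in the set
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False

    # (b) a two-sided identity element
    identities = [
        e for e in elements if all(op(e, a) == a == op(a, e) for a in elements)
    ]
    if not identities:
        return False
    e = identities[0]

    # (c) associativity: (a . b) . c == a . (b . c)
    if any(
        op(op(a, b), c) != op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    ):
        return False

    # (d) every element has a two-sided inverse
    return all(any(op(a, b) == e == op(b, a) for b in elements) for a in elements)


print(is_group(range(4), lambda a, b: (a + b) % 4))     # True:  addition modulo 4
print(is_group(range(1, 5), lambda a, b: (a * b) % 5))  # True:  nonzero residues modulo 5
print(is_group(range(5), lambda a, b: (a * b) % 5))     # False: 0 has no inverse
print(is_group(range(3), lambda a, b: (a - b) % 3))     # False: no two-sided identity
```

The associativity test inspects every triple of elements, so the running time grows with the cube of the size of the set; this is adequate only for small examples.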

It is clear that every group of permutations is an abstract group, with composition as the binary operation in question. The identity permutation Id functions as the identity $1_G$, and the inverse permutation $\sigma^{-1}$ functions as the inverse of σ for every permutation σ. It is also clear that the set of the real numbers, together with the operation of addition, constitutes a group. Here 0 functions as the group identity and the inverse of a is −a for all real a. This group will be denoted by (R, +). Similarly (Z, +), (Q, +), and (C, +) denote groups whose underlying sets are the integers, the rationals, and the complex numbers, respectively. On the other hand, the integers under multiplication do not constitute a group, since very few integers have multiplicative inverses. Nor do the rational, the real, or the complex numbers constitute a group under multiplication, since in each case 0 fails to have a multiplicative inverse.

         | Id        (1234)    (13)(24)  (1432)    (13)      (24)      (12)(34)  (14)(23)
---------+-------------------------------------------------------------------------------
Id       | Id        (1234)    (13)(24)  (1432)    (13)      (24)      (12)(34)  (14)(23)
(1234)   | (1234)    (13)(24)  (1432)    Id        (14)(23)  (12)(34)  (13)      (24)
(13)(24) | (13)(24)  (1432)    Id        (1234)    (24)      (13)      (14)(23)  (12)(34)
(1432)   | (1432)    Id        (1234)    (13)(24)  (12)(34)  (14)(23)  (24)      (13)
(13)     | (13)      (12)(34)  (24)      (14)(23)  Id        (13)(24)  (1234)    (1432)
(24)     | (24)      (14)(23)  (13)      (12)(34)  (13)(24)  Id        (1432)    (1234)
(12)(34) | (12)(34)  (24)      (14)(23)  (13)      (1432)    (1234)    Id        (13)(24)
(14)(23) | (14)(23)  (13)      (12)(34)  (24)      (1234)    (1432)    (13)(24)  Id

Table 9.2 The multiplication table of $D_4$


We have already encountered many other groups in this book. Recalling that $\sqrt[n]{1}$ denotes the set of all the complex nth roots of 1, we note that $(\sqrt[n]{1}, \cdot)$ is a group wherein · denotes the ordinary multiplication of complex numbers. The identity element of this group is 1, the inverse of the root ζ is simply $\zeta^{-1}$, also an element of $\sqrt[n]{1}$, and it is clear that if ζ and η are any two elements of $\sqrt[n]{1}$, then so is their product ζη since $(\zeta\eta)^n = \zeta^n\eta^n = 1$.

For any positive integer n, addition modulo n defines a group $(Z_n, +)$, with 0 acting as the group identity and −a acting as the inverse of a. However, when p is a prime integer, modular arithmetic can be used to define another, less expected, collection of groups. If p is a prime number, set $Z_p^* = \{1, 2, 3, \ldots, p-1\}$. Then we know that each element of $Z_p^*$ has a multiplicative inverse in $Z_p$. Consequently, $(Z_p^*, \cdot)$ is a group with identity 1, where · denotes multiplication modulo p. This collection of groups can be considerably enlarged as follows. For each positive integer n let $Z_n^*$ denote the set of positive integers that are both smaller than n and relatively prime to it. For example, $Z_6^* = \{1, 5\}$, $Z_8^* = \{1, 3, 5, 7\}$, and $Z_{10}^* = \{1, 3, 7, 9\}$. Corollary 4.4 and Lemma 4.6 guarantee that $(Z_n^*, \cdot)$ is indeed a group, where · is multiplication modulo n.
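The sets $Z_n^*$ and the inverses of their elements are easy to tabulate. The sketch below is illustrative: the function names are hypothetical, and the three-argument form `pow(a, -1, n)` for modular inverses assumes Python 3.8 or later.

```python
from math import gcd


def units(n):
    """The positive integers smaller than n that are relatively prime to n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def inverse_table(n):
    """Map each element of Z_n^* to its multiplicative inverse modulo n."""
    return {a: pow(a, -1, n) for a in units(n)}


def is_closed(n):
    """Products modulo n of elements of Z_n^* stay inside Z_n^*."""
    u = units(n)
    return all((a * b) % n in u for a in u for b in u)


print(units(6), units(8), units(10))  # [1, 5] [1, 3, 5, 7] [1, 3, 7, 9]
print(inverse_table(10))              # {1: 1, 3: 7, 7: 3, 9: 9}
print(is_closed(10))                  # True
```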

If F is any field, then (F, +) also forms a group. Note that if F = GF(2, P(x)), then each a in F is its own inverse in this group, since a + a = 0. Again, if $F^*$ denotes all the nonzero elements of F, then $(F^*, \cdot)$ is also a group, where · denotes the multiplication operation in the field.

If F is any field, then F[x], the set of polynomials with coefficients in F, is a group with respect to addition. Similarly, if n is any positive integer and F[x, ≤n] denotes all the polynomials in F[x] that have degree at most n together with the zero polynomial, then F[x, ≤n] is also a group with respect to addition. For example, $Z_2[x, \le 1]$ has as elements 0, 1, x, and 1 + x.

Some groups can be defined directly by means of a multiplication table. The group whose multiplication table is Table 9.3 is called the Quaternion group. It is clear that in this multiplication table 1 functions as an identity and $a^{-1} = e$, $b^{-1} = f$, etc. The direct verification of the associativity of this multiplication table calls for several hundred computations. More efficient techniques are available, but they fall outside the confines of this text.

A group is said to be a commutative group (or abelian group) if for any two of its elements a and b we have ab = ba. Thus, the groups $(Z_n, +)$, GF(p, P(x)), $Z_n^*$, Z, and C are all commutative. On the other hand, if n ≥ 3, then the group $S_n$ is not commutative since in each such group

(1 2)(2 3) = (1 2 3) ≠ (3 2 1) = (2 3)(1 2).


Table 9.3 The multiplication table of the Quaternion group


A digression on the nature of multiplication tables might be in order here. It is clear that (the interior of) the multiplication table of a group with n elements consists of an n-by-n array. It is customary to list the rows and the columns of the array in the same order, with the row and column that correspond to the identity element appearing first. Each row and each column of the multiplication table constitutes a permutation of the elements of the group. A table that possesses all these properties is called a Latin square. Thus, the multiplication table of every group is a Latin square. The converse is not true. Most Latin squares do not come from groups, and Exercises 9.2.31 to 9.2.35 contain additional information on this subject.
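The permutation property of rows and columns, and its failure to guarantee a group, can be seen on small tables. The following sketch is an illustration only: it checks just the row and column condition, not the convention of listing the identity first, and the example of subtraction modulo 3 is chosen here and does not come from the text.

```python
def cayley_table(elements, op):
    """The table whose entry in row a and column b is op(a, b)."""
    return [[op(a, b) for b in elements] for a in elements]


def is_latin_square(table):
    """Every row and every column is a permutation of the same n symbols."""
    n = len(table)
    symbols = set(table[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in table)
    cols_ok = all({table[i][j] for i in range(n)} == symbols for j in range(n))
    return len(symbols) == n and rows_ok and cols_ok


def is_associative(elements, op):
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a in elements for b in elements for c in elements
    )


def add4(a, b):
    return (a + b) % 4


def sub3(a, b):
    return (a - b) % 3


Z4, Z3 = range(4), range(3)
print(is_latin_square(cayley_table(Z4, add4)), is_associative(Z4, add4))  # True True
print(is_latin_square(cayley_table(Z3, sub3)), is_associative(Z3, sub3))  # True False
```

The table of subtraction modulo 3 is a Latin square, yet the operation is not associative, so the table cannot be the multiplication table of a group.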

The definition of an abstract group stipulates the existence of inverses, but says nothing about the possible existence of multiple inverses. The next proposition shuts the door on this possibility.

Proposition 9.5 If G is a group and a ∈ G, then a has exactly one inverse in G.


Proof. Suppose both b and c are inverses of a in G, i.e., $ba = ab = 1_G = ca = ac$. Then
$b = b1_G = b(ac) = (ba)c = 1_G c = c.$ □


The next proposition about inverses will prove useful later.

Proposition 9.6 If a and b are elements of the group G, then $(ab)^{-1} = b^{-1}a^{-1}$.


Proof. Several applications of the Associative Law yield

$(b^{-1}a^{-1})(ab) = b^{-1}(a^{-1}a)b = b^{-1}1_G b = b^{-1}b = 1_G$

and

$(ab)(b^{-1}a^{-1}) = a(bb^{-1})a^{-1} = a1_G a^{-1} = aa^{-1} = 1_G.$

Thus, $b^{-1}a^{-1}$ acts like an inverse of ab and hence, by Proposition 9.5, it must be the inverse of ab, i.e., $b^{-1}a^{-1} = (ab)^{-1}$. □
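The order of the factors in Proposition 9.6 matters in a noncommutative group. As an illustrative check, not part of the original text, the sketch below runs over all pairs of permutations of three symbols, stored as 0-based image tuples; the helper names are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """The permutation 'first q, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


S3 = list(permutations(range(3)))

# (ab)^(-1) == b^(-1) a^(-1) for every pair a, b
reversed_order = all(
    inverse(compose(a, b)) == compose(inverse(b), inverse(a))
    for a in S3 for b in S3
)

# (ab)^(-1) == a^(-1) b^(-1) fails for some pair, since S_3 is not commutative
same_order = all(
    inverse(compose(a, b)) == compose(inverse(a), inverse(b))
    for a in S3 for b in S3
)

print(reversed_order, same_order)  # True False
```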


If a is any element of a group G, and n is any positive integer, then we define $a^n$ as the product of n a's. Thus, $a^1 = a$, $a^2 = aa$, $a^3 = aaa$, and so on. If we also define $a^0 = 1_G$ and $a^{-n} = (a^n)^{-1}$, then it is easily verified (Exercise 9.2.36) that, just like the powers of real numbers, the powers of abstract group elements satisfy the conditions $a^m a^n = a^{m+n}$ and $(a^m)^n = a^{mn}$ for any two integers m and n.
Exercises 9.2 


Each of Exercises 9.2.1 to 9.2.14 specifies a set and a binary operation. In which cases do 


these form a group? If not, explain why not. 
1. All the even elements of Z, g99 under addition 
2. All the odd elements of Z, o9) under addition 
3- All the even elements of Z, 9) under multiplication 
4. All the odd elements of Z, gg) under multiplication 
5- All the even elements of Z,, under multiplication 
6. All the odd elements of Z,, under multiplication 
7- All the integers under subtraction 
8. All the integers under addition 
9. All the integers under multiplication 
10. All the positive real numbers under addition 
11. All the positive real numbers under multiplication 
12. All the positive real numbers under division 
13. All the polynomials over Z, under addition 


14. All the polynomials over Z, under multiplication 




In each of the groups in Exercises 9.2.15 to 9.2.27 pair each element with its inverse. 


15. (Z4,+) 19. Z,[x,<1] 23. (Zj,-) 
16. (Zi,-) 2005; 24. V1 

17 V1 21. (Zs,+) 25- (Ze, +) 
18. K 22. Z,[x,<2] 26. Z,[x,<1] 


27. The Quaternion group. 
28. Prove that if G is a group for which $(ab)^{-1} = a^{-1}b^{-1}$ for every pair of elements a, b,
then G is a commutative group.


29. Prove that if G is a group in which every nonidentity element has order 2, then G 
is commutative. 

30. What geometrical feature characterizes the multiplication table of commutative 
groups? 


31. Explain why the Latin square below is not the multiplication table of a group. 


1 2 3 4 5
2 1 5 3 4
3 4 1 5 2
4 5 2 1 3
5 3 4 2 1


33. Prove that every 1-by-1 and every 2-by-2 Latin square is the multiplication table of
a group.

34. Prove that every 3-by-3 Latin square is the multiplication table of a group.

35. Prove that every 4-by-4 Latin square is the multiplication table of a group.

36. Prove that if a is an element of the abstract group G, and m and n are arbitrary
integers, then $a^m a^n = a^{m+n}$ and $(a^m)^n = a^{mn}$.

37. Prove that every group has a unique identity element.


9.3 Isomorphisms of Groups and Orders of Elements


Since abstract groups are defined in terms of their multiplication tables, it makes sense
to identify abstract groups that have identical tables. With this in mind, let us examine
Tables 9.4 to 9.7. Table 9.4 is an abbreviation of the table associated with the Klein
4-group. Table 9.5 represents $(Z_4, +)$. Table 9.6 represents a mystery group, as yet
unidentified. Table 9.7 represents the group (F, +) where F is the Galois field
$GF(2, x^2 + x + 1)$.


   | Id  a   b   c
---+---------------
Id | Id  a   b   c
a  | a   Id  c   b
b  | b   c   Id  a
c  | c   b   a   Id

Table 9.4 The Klein 4-group


+ | 0 1 2 3
--+--------
0 | 0 1 2 3
1 | 1 2 3 0
2 | 2 3 0 1
3 | 3 0 1 2

Table 9.5 Addition modulo 4


Table 9.4 and Table 9.7 are readily recognized as being essentially one and the same.
In each table the diagonal entry is the group identity, and in each table the group
multiplication of any two distinct nonidentity elements equals the third nonidentity element.


Table 9.7 Addition in $GF(2, x^2 + x + 1)$


Table 9.5 and Table 9.6 are also essentially the same. To see this it is merely necessary to 
switch the columns and rows of Table 9.6 that correspond to the elements A and B so 
that this table takes the form displayed in Table 9.8. 

When the symbols e, B, A, and C of Table 9.8 are replaced by 0, 1, 2, and 3, 
respectively, Table 9.5 is obtained, thus showing that Table 9.5 and Table 9.6 are only 
superficially different. 

Is it possible that some such switching of rows and columns and rewriting of symbols 
could transform Table 9.4 to Table 9.5? The answer is no, and a reason for this can be 
found in the diagonals of these tables. Notice that the diagonal of Table 9.4 contains 
only the group identity, whereas the diagonal of Table 9.5 contains another element 
besides the identity. Now it is clear that no matter how the symbols of Table 9.4 are 
relabeled, the diagonal will always contain only that symbol that stands for the group 
identity. Moreover, even when the row and column of any element are extracted and 
moved (in a consistent manner) to a new location, the diagonal still only contains the 
group identity. Hence, Table 9.4 and Table 9.5 are different in an essential way. 

We formalize this notion of sameness with the term isomorphism. Two groups
$(G, \cdot)$ and $(H, \otimes)$ are said to be isomorphic provided that their elements can be matched
up so that when the elements of G in a table of $(G, \cdot)$ are replaced with the corresponding
elements of H, then the table of $(G, \cdot)$ is transformed into a multiplication table for
$(H, \otimes)$. In other words, the two groups $(G, \cdot)$ and $(H, \otimes)$ are isomorphic if there exists a
function $f : G \to H$ such that

(a) $f(a) = f(b)$ if and only if $a = b$,

(b) for every $h \in H$ there is a $g \in G$ such that $f(g) = h$, and

(c) $f(a \cdot b) = f(a) \otimes f(b)$.

The first requirement says that f assigns distinct elements of H to distinct elements of
G. The second requirement says that every element of H is assigned to some element
of G. When dealing with finite groups of the same order, which are our main concern here,
these two conditions are redundant in the sense that each implies the other. For infinite
groups this need not be the case (see Exercises 9.3.20 and 9.3.21).


Table 9.8 A rewriting of the mystery group


Figure 9.9 The third requirement for isomorphisms


To understand the last requirement, note that if in a multiplication table for $(G, \cdot)$ each
element a is replaced by f(a), and if the result is a multiplication table for $(H, \otimes)$, then
the entry $a \cdot b$ in the a-row and b-column will be replaced by $f(a \cdot b)$. However, because
this is now the entry in the f(a)-row and f(b)-column of the multiplication table of
$(H, \otimes)$, it must also equal $f(a) \otimes f(b)$ (see Figure 9.9). Hence, $f(a \cdot b) = f(a) \otimes f(b)$.
The function f is called an isomorphism. Contrary to its usage in elementary calculus,
the word function is used here in its most abstract sense, that of a mere association. Thus,
the function

$$f(Id) = 0, \quad f(a) = 1, \quad f(b) = \alpha, \quad f(c) = 1 + \alpha$$

is an isomorphism of the group in Table 9.4 and the group in Table 9.7.

The function

$$f(Id) = 0, \quad f(a) = \alpha, \quad f(b) = 1 + \alpha, \quad f(c) = 1$$

is another isomorphism of the group in Table 9.4 and the group in Table 9.7. Similarly

$$g(e) = 0, \quad g(A) = 2, \quad g(B) = 1, \quad g(C) = 3$$

is an isomorphism of the group in Table 9.6 and the group in Table 9.5. However, the
function

$$g'(e) = 0, \quad g'(A) = 1, \quad g'(B) = 2, \quad g'(C) = 3$$

is not an isomorphism, because it violates the last requirement of isomorphisms, since
$g'(A \cdot A) = g'(e) = 0$ but $g'(A) + g'(A) = 1 + 1 \neq 0$.
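
The three requirements for an isomorphism can be tested by brute force on small tables. In the sketch below, the dictionary encoding of tables and the name `is_isomorphism` are assumptions made for illustration. The mystery group is rebuilt from the statement that replacing e, B, A, C by 0, 1, 2, 3 turns its table into addition modulo 4, and the Klein 4-group is built from the rule that the product of two distinct nonidentity elements is the third.

```python
from itertools import permutations

def is_isomorphism(f, table_g, table_h):
    """f maps elements of G to elements of H; a table maps a pair (x, y) to x*y."""
    elems_g = {x for x, _ in table_g}
    elems_h = {x for x, _ in table_h}
    images = set(f.values())
    bijective = len(images) == len(elems_g) and images == elems_h
    respects_products = all(f[table_g[a, b]] == table_h[f[a], f[b]]
                            for a in elems_g for b in elems_g)
    return bijective and respects_products

z4 = {(x, y): (x + y) % 4 for x in range(4) for y in range(4)}

# Mystery group, reconstructed through the relabeling e, B, A, C -> 0, 1, 2, 3.
g = {"e": 0, "A": 2, "B": 1, "C": 3}
g_inv = {v: k for k, v in g.items()}
mystery = {(x, y): g_inv[(g[x] + g[y]) % 4] for x in g for y in g}
g_prime = {"e": 0, "A": 1, "B": 2, "C": 3}

# Klein 4-group on the symbols Id, a, b, c.
names = ["Id", "a", "b", "c"]
def klein_product(x, y):
    if x == "Id":
        return y
    if y == "Id":
        return x
    if x == y:
        return "Id"
    return ({"a", "b", "c"} - {x, y}).pop()
klein = {(x, y): klein_product(x, y) for x in names for y in names}

# Try every possible matching of Klein elements with 0, 1, 2, 3.
klein_matches_z4 = any(is_isomorphism(dict(zip(names, image)), klein, z4)
                       for image in permutations(range(4)))

print(is_isomorphism(g, mystery, z4), is_isomorphism(g_prime, mystery, z4), klein_matches_z4)
```

The exhaustive search over all 24 matchings agrees with the diagonal argument: no relabeling turns the Klein 4-group table into the table of addition modulo 4.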

If the groups G and H are isomorphic to each other, then this fact is denoted by
writing $G \cong H$. Some of the more general properties of isomorphisms are given in
Exercise 9.3.23.

It is clear that if two finite groups are isomorphic, then they must have the same
number of elements. This leads us to define the order of a finite group as the number
of its elements, so that $(Z_n, +)$ has order n and $S_n$ has order n!. In his note of 1878,
reproduced in Appendix E, Cayley already mentions the problem of classifying all the
isomorphism types of groups. The difficulty of this task is underscored by the fact that
Cayley asserts erroneously that, up to isomorphism, there are three groups of order 6. It
will be proved in Chapter 11 that every group of order 6 is isomorphic to either $(Z_6, +)$
or to $S_3$.

One useful tool for distinguishing between nonisomorphic groups is the property of
commutativity. It is easily verified (Exercise 9.3.19) that for $n \ge 3$ the dihedral group
$D_n$ is not commutative, as is the case for the Quaternion group. Thus, the commutative
group $(Z_8, +)$ is not isomorphic to either $D_4$ or to the Quaternion group.

Another tool for distinguishing between nonisomorphic groups is provided by the
notion of the order of an element of a group. In analogy with the notion of the order of
a permutation, we define the order o(a) of an element a of an abstract group G as the
least positive integer n such that $a^n = 1_G$. If no such integer n exists, then we say that
the element a has infinite order.

The element d of the Quaternion group has order 2, whereas each of the elements a,
b, c, e, f, and g has order 4. The identity element always has order 1, and it is clearly
the only group element that can have order 1. On the other hand, the element 2 of Z
has infinite order since $2 + 2 + 2 + \cdots + 2$ is never $0 = 1_Z$.
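
The definition of order translates directly into a loop. The sketch below is a minimal illustration; the function name `element_order` and the step cap used to stand in for "no such integer exists" are assumptions, since a program cannot search forever.

```python
def element_order(a, op, identity, max_steps=10_000):
    """Least positive n with a^n == identity, or None if none is found within max_steps."""
    power, n = a, 1
    while power != identity:
        power = op(power, a)
        n += 1
        if n > max_steps:
            return None  # treated here as "infinite order" (an approximation)
    return n

add_mod12 = lambda x, y: (x + y) % 12
add_integers = lambda x, y: x + y

print(element_order(0, add_mod12, 0))      # the identity has order 1
print(element_order(8, add_mod12, 0))      # 8, 8+8, 8+8+8 = 0 in Z_12
print(element_order(2, add_integers, 0))   # 2 + 2 + ... + 2 is never 0 in Z
```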

Suppose now that the groups $(G, \cdot)$ and $(H, \otimes)$ are isomorphic. Since this is
tantamount to saying that they have the same multiplication tables, it follows that
corresponding elements of G and H must have the same orders.

Returning to the Quaternion group and the dihedral group $D_4$, the first has exactly
one element of order 2, whereas the latter has five such elements. Consequently, these
two groups are not isomorphic.

Group-theoretic order enjoys the same properties as do the orders of roots of unity
and permutations. Proposition 9.7 below is practically identical with Proposition 7.7,
and for that reason no proof is offered.


Proposition 9.7 Let g and h be elements of a finite group G. Then,
(a) $g^n = 1_G$ if and only if n is a multiple of o(g);
(b) $g^a = g^b$ if and only if $a - b$ is a multiple of o(g);
(c) if o(g) = n, then $o(g^k) = n/(k, n)$;
(d) $o(gh) = o(g)\,o(h)$ if o(g) and o(h) are relatively prime and $gh = hg$.


That the assumption $gh = hg$ is necessary in the last part of Proposition 9.7 can be
seen by choosing g = (12)(34) and h = (145). Then

$$o(gh) = o((12)(34)(145)) = o((13452)) = 5 \neq 2 \cdot 3 = o(g)\,o(h).$$

This assumption is of course automatically satisfied in the context of fields wherein
Proposition 7.7 was stated.
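
This counterexample can be reproduced with a few lines of Python. The dictionary representation of permutations and the helper names below are assumptions made for illustration; products are read right to left (the right factor is applied first), which is the reading under which (12)(34)(145) equals (13452).

```python
def from_cycles(cycles, n):
    """Build a permutation of {1, ..., n} as a dict from a list of disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for i, x in enumerate(cycle):
            perm[x] = cycle[(i + 1) % len(cycle)]
    return perm

def compose(p, q):
    """Return the product pq: apply q first, then p."""
    return {x: p[q[x]] for x in q}

def order(p):
    """Least positive n such that p^n is the identity."""
    identity = {x: x for x in p}
    power, n = dict(p), 1
    while power != identity:
        power = compose(p, power)
        n += 1
    return n

g = from_cycles([(1, 2), (3, 4)], 5)
h = from_cycles([(1, 4, 5)], 5)
gh = compose(g, h)

print(order(g), order(h), order(gh))
print(gh == from_cycles([(1, 3, 4, 5, 2)], 5), gh == compose(h, g))
```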




Table 9.9 A group table 


Table 9.10 A group table 


Exercises 9.3 


1. Prove that every two groups of order 2 are isomorphic to each other. 

2. Prove that every two groups of order 3 are isomorphic to each other. 

3. Are every two groups of order 4 isomorphic to each other? 

4. Prove that the groups whose multiplication tables are in Table 9.9 and Table 9.10 
are isomorphic to each other. 

5. Prove that the groups whose multiplication tables are in Table 9.10 and Table 9.11
are isomorphic to each other.

6. Prove that the groups whose multiplication tables are in Table 9.9 and Table 9.11
are isomorphic to each other.

7. Explain why the cube and the octahedron of Figure 9.5 have isomorphic vertex
groups.




Table 9.11 A group table 


8. A double pyramid is formed by joining the vertices of a regular pentagon to two
points, one directly above and one directly below the pentagon's geometrical center.
Explain why the group of vertex symmetries of this solid is isomorphic to the
dihedral group $D_5$.


An isomorphism of a group with itself is called an automorphism.


9. Prove that the function f(x) = 3x is an automorphism of (Z, 99, +).

10. Prove that if k and n are relatively prime positive integers, then the function
f(x) = kx is an automorphism of $(Z_n, +)$.

11. Is the function f(x) =x? an automorphism of (Z,99 +)?

12. Is the function $f(x) = x^{-1}$ an automorphism of (Z},, -)?

13. Prove that the function $f(x) = x^p$ is an automorphism of $(GF(p, P(x)), +)$.

14. Prove that the function $f(x) = x^p$ is an automorphism of $(GF(p, P(x))^*, \cdot)$.

15. For any element x of a group G let $f_x$ be the function from G into itself defined
by $f_x(a) = xax^{-1}$ for all $a \in G$. Prove that $f_x$ is an automorphism of G.

16. For any element x of a group G let $h_x$ be the function from G into itself defined
by $h_x(a) = xax$ for all $a \in G$. Prove that $h_x$ is an automorphism of G if and only
if $x = x^{-1}$.

17. Prove that every group of even order contains an element of order 2.

18. Let G be a finite commutative group. The exponent of G is the least common
multiple of all orders of all the elements of G. Prove that G has an element whose
order equals the exponent of G.

19. Prove that for $n \ge 3$ the dihedral group $D_n$ is not commutative.


20. For the group G = (Z, +), find a function f from G to G that satisfies the first
and third conditions but not the second condition.

21. Find a function f of the positive integers into themselves that satisfies the second
condition for isomorphisms but does not satisfy the first condition.

22. Prove that if a and b are elements of a group G, then $o(a) = o(bab^{-1})$.

23. Let $f : G \to H$ be an isomorphism. Prove the following:
(a) $f(1_G) = 1_H$
(b) for every $g \in G$, $f(g^{-1}) = [f(g)]^{-1}$.


9.4 Subgroups and Their Orders 


A copy of the group

K = {Id, (12)(34), (13)(24), (14)(23)},

consisting of the vertex symmetries of the rectangle, is clearly contained in the group

$D_4$ = {Id, (12)(34), (13)(24), (14)(23), (1234), (1432), (13), (24)},

which consists of the vertex symmetries of the square. This relationship is formalized by
the notion of a subgroup. If $(G, \cdot)$ is an abstract group, and if H is a subset of G such
that $(H, \cdot)$ is a group in its own right, then $(H, \cdot)$ is said to be a subgroup of $(G, \cdot)$. Thus,
$(K, \circ)$ is a subgroup of $(D_4, \circ)$. This is generally abbreviated to say that K is a subgroup
of $D_4$. Similarly, each of the groups (Z, +), (Q, +), (R, +), and (C, +) is a subgroup of
the next. It is clear that if G is a group and H is a subset of G, then H is a subgroup of
G if and only if the following conditions hold.

(a) $1_G$ is in H,

(b) if a and b are in H, so is ab,

(c) if a is in H, then so is its inverse $a^{-1}$.

Thus {0, 2, 4} is a subgroup of $Z_6$. If β is the Galois imaginary associated with the
irreducible polynomial $x^3 + x^2 + 1$ over $Z_2$ and $F = GF(2, x^3 + x^2 + 1)$, then each of
the following sets defines a subgroup of (F, +): $\{0, 1, \beta, 1+\beta\}$, $\{0, 1, \beta^2, 1+\beta^2\}$, and
$\{0, \beta, \beta^2, \beta+\beta^2\}$.

The alternating group $A_n$ that consists of all the even permutations of n symbols is a
subgroup of the symmetric group $S_n$ that consists of all the permutations on n symbols.
The group of symmetries of the cube is a subgroup of the group $S_8$ of all the permutations
of the eight vertices of the cube. The group $\sqrt[6]{1} = \{1, -\omega^2, \omega, -1, \omega^2, -\omega\}$ contains both
$\sqrt[2]{1} = \{1, -1\}$ and $\sqrt[3]{1} = \{1, \omega, \omega^2\}$ as subgroups. If f is any function of n variables,
then the group $S_{n,f}$ that consists of all the permutations that leave f unchanged is a
subgroup of the symmetric group $S_n$.

Every group contains two obvious subgroups: itself and the trivial subgroup $\{1_G\}$
that consists of the identity element of G alone. Any other subgroup of G is said to be
proper.

The following theorem is quite possibly the most important theorem of group theory. 
In this form it was first stated and proved by Camille Jordan in his book Traité des 
Substitutions. Because he modestly attributed it to Lagrange who in fact had only proved 


the limited version of Corollary 9.13, this theorem nowadays bears the latter’s name. 


Theorem 9.8 (Lagrange's Theorem) If G is a finite group, then the order of any subgroup
of G is a divisor of the order of G.


Since the proof of this theorem relies on the notion of a coset, a concept that is all but
explicit in Lagrange's original proof, the proof is preceded by a discussion of this concept.
If $H = \{h_1, h_2, h_3, \ldots\}$ is any subgroup of the group $(G, \cdot)$, and if a is any element of
the original group G, we define

$$a \cdot H = \{a \cdot h_1, a \cdot h_2, a \cdot h_3, \ldots\}$$

and call $a \cdot H$ a coset of H. If $G = (Z_{12}, +)$ and H = {0, 4, 8}, then

0 + H = {0, 4, 8} = H = 4 + H = 8 + H,
1 + H = {1, 5, 9} = 5 + H = 9 + H,
2 + H = {2, 6, 10} = 6 + H = 10 + H,
3 + H = {3, 7, 11} = 7 + H = 11 + H.

These cosets are pictured in Figure 9.10.
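
The coset computation above is easy to mechanize. The following sketch is a minimal illustration; the function name `cosets` and the choice to return cosets in order of first appearance are assumptions, not part of the text.

```python
def cosets(group, subgroup, op):
    """List the distinct cosets a*H, in the order in which new ones are found."""
    seen, result = set(), []
    for a in group:
        if a in seen:
            continue  # a already lies in a coset found earlier
        coset = frozenset(op(a, h) for h in subgroup)
        result.append(coset)
        seen |= coset
    return result

add_mod12 = lambda x, y: (x + y) % 12
z12_cosets = cosets(range(12), [0, 4, 8], add_mod12)

for coset in z12_cosets:
    print(sorted(coset))
```

The four cosets have three elements each and together cover all twelve elements of the group exactly once.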


Another example begins with the polynomial $f = x_1 + x_2 - x_3 - x_4$ and its group

$H = S_{4,f}$ = {Id, (12), (34), (12)(34)},

which is by definition a subgroup of $S_4$. Here,

(1234)H = {(1234), (134), (123), (13)}
        = (134)H = (123)H = (13)H,

(1243)H = {(1243), (143), (124), (14)}
        = (143)H = (124)H = (14)H,

(1324)H = {(1324), (14)(23), (13)(24), (1423)}
        = (14)(23)H = (13)(24)H = (1423)H,

(1342)H = {(1342), (234), (132), (23)}
        = (234)H = (132)H = (23)H,

(1432)H = {(1432), (243), (142), (24)}
        = (243)H = (142)H = (24)H.

These sets are pictured in Figure 9.11.


Figure 9.10 The cosets of {0, 4, 8} in $Z_{12}$


The patterns that are indicated in the above examples hold in general.


Proposition 9.9 Let H be a subgroup of the group G. If H is finite, then every two
cosets of H have the same number of elements, and every two distinct cosets of H are
in fact disjoint.


Proof. It suffices to show that if H has m elements, then every coset of H also has
m elements. Suppose $H = \{h_1, h_2, h_3, \ldots, h_m\}$. Then $aH = \{ah_1, ah_2, ah_3, \ldots, ah_m\}$.
Moreover, if $ah_i = ah_j$, then $h_i = a^{-1}ah_i = a^{-1}ah_j = h_j$, and so distinct elements of
H give rise to distinct elements of aH. Thus H and aH contain the same number of
elements.

Suppose the two cosets aH and bH share some element. In other words, suppose
there exist $h, k \in H$ such that $ah = bk$, or $a = bkh^{-1}$. Since H is a group in its own right,
all the elements of the product $kh^{-1}H$ are back in H so that $kh^{-1}H \subseteq H$ and hence
$aH = b(kh^{-1})H \subseteq bH$. A symmetrical argument leads to the inclusion $bH \subseteq aH$, and
hence we may conclude that aH = bH. Thus, if any two cosets of H share an element,
they must in fact be equal. In other words, distinct cosets must be disjoint. ∎


{Id, (12), (34), (12)(34)}
{(1234), (134), (123), (13)}
{(1243), (143), (124), (14)}
{(1324), (14)(23), (13)(24), (1423)}
{(1342), (234), (132), (23)}
{(1432), (243), (142), (24)}

Figure 9.11 The cosets of {Id, (12), (34), (12)(34)} in $S_4$


Proof of Theorem 9.8. If H is a subgroup of the finite group G, then, by Proposition 9.9,
the cosets of H in G constitute a partition of the elements of G into sets all of which
have the same cardinality as H. Consequently, the order of H must divide the order of
G. ∎


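Section 7.2 contains such a computation of cosets. Specifically, it was demonstrated
there that if γ is the Galois imaginary associated with $GF(2, x^4 + x^3 + x^2 + x + 1)$, then
the cosets of the subgroup $\langle\gamma\rangle = \{1, \gamma, \gamma^2, \gamma^3, 1+\gamma+\gamma^2+\gamma^3\}$ of the multiplicative
group $GF^*(2, x^4 + x^3 + x^2 + x + 1)$ consist of $\langle\gamma\rangle$ itself as well as the two sets

$$\{1+\gamma^2,\ \gamma+\gamma^3,\ 1+\gamma+\gamma^3,\ 1+\gamma^3,\ 1+\gamma^2+\gamma^3\}$$

and

$$\{1+\gamma,\ \gamma+\gamma^2,\ \gamma^2+\gamma^3,\ 1+\gamma+\gamma^2,\ \gamma+\gamma^2+\gamma^3\}.$$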


Cosets also played crucial, though implicit, roles elsewhere in this book. They appear
in the proof of Galois's Theorem (Theorem 7.11). The sums used by Gauss in the proof
of Theorem 2.15 are also cosets in disguise. Thus, the exponents of the summands of A in
this proof constitute the order 8 subgroup {1, 9, 13, 15, 16, 8, 4, 2} of $Z_{17}^*$, and those of
B are its only other coset. Also, the exponents of the summands of C constitute the
subgroup {1, 13, 16, 4} of $Z_{17}^*$ and those of D, E, and F are its cosets. Similarly, if z is
any complex number, then the argument of z is in fact a coset of the subgroup

$$\langle 360^\circ \rangle = \{\ldots, -720^\circ, -360^\circ, 0^\circ, 360^\circ, 720^\circ, \ldots\}$$

of R.

The number of cosets that the subgroup H has in G is called the index of H in G
and is denoted by [G : H]. The corollary below follows directly from Proposition 9.9,
seeing as $H = 1 \cdot H$ is a coset.


Corollary 9.10 If H is a subgroup of the finite group G, then [G : H] is equal to the
order of G divided by the order of H.


The next corollary greatly facilitates the task of deciding when two elements belong to
the same coset.

Corollary 9.11 Let H be a subgroup of G and let a and b be two elements of G.
Then the following are equivalent:

(a) aH = bH,

(b) there exists an element c of G such that $a, b \in cH$,

(c) $a^{-1}b \in H$.

Proof. (a) implies (b): Suppose aH = bH. Since $a = a1_G \in aH$ and $b = b1_G \in bH = aH$,
it follows that both a and b belong to aH.

(b) implies (c): Suppose there is an element c of G such that $a, b \in cH$. In other
words, suppose there exist $h, k \in H$ such that a = ch and b = ck. Then,

$$a^{-1}b = (ch)^{-1}(ck) = h^{-1}c^{-1}ck = h^{-1}k \in H.$$

(c) implies (a): Suppose $a^{-1}b \in H$, or, in other words, $a^{-1}b = h$ for some $h \in H$.
Then b = ah and so the cosets aH and bH both contain the element $ah = b1_G$. By
Proposition 9.9, aH = bH. ∎


Not surprisingly, the significance of a coset depends on the meaning of both the
ambient group and the defining subgroup. In the case of the subgroup

$$H = S_{4,\,x_1 + x_2 - x_3 - x_4} = \{Id, (12), (34), (12)(34)\}$$

of the symmetric group $G = S_4$, H consists of all the permutations of {1, 2, 3, 4} that
leave the polynomial $f = x_1 + x_2 - x_3 - x_4$ unchanged. Here, the cosets of H turn out
to be in a one-to-one correspondence with the distinct variants of f. Thus, the elements
of the coset

(1234)H = {(1234), (134), (123), (13)}

all change f to the polynomial $x_2 + x_3 - x_1 - x_4$, the elements of the coset

(1243)H = {(1243), (143), (124), (14)}

all change f to the polynomial $x_2 + x_4 - x_1 - x_3$, etc. In general we have the following
proposition whose verification is relegated to Exercise 9.4.58.

Proposition 9.12 Let f be any function of the variables $x_1, x_2, \ldots, x_n$. Then the two
elements ρ and σ of $S_n$ are in the same coset of $S_{n,f}$ if and only if $\rho f = \sigma f$.

We next point out what Lagrange actually proved.


Corollary 9.13 If f is a function of n variables and m is the number of distinct variants
of f, then m is a divisor of n!.


Proof. According to Proposition 9.12, $m = [S_n : S_{n,f}]$, so that, by Corollary 9.10, m is
a divisor of the order of $S_n$, which is n!. ∎
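
The count in Corollary 9.13 can be observed directly for $f = x_1 + x_2 - x_3 - x_4$. In the sketch below, the polynomial is encoded by its coefficient vector and a permutation acts by moving the coefficient of $x_i$ to $x_{\sigma(i)}$; this encoding, the 0-based indexing, and the name `act` are assumptions made for illustration. The counts do not depend on which of the two usual action conventions is chosen.

```python
from itertools import permutations

coeffs = (1, 1, -1, -1)  # f = x1 + x2 - x3 - x4

def act(sigma, c):
    """Coefficient vector obtained by replacing each variable x_i with x_{sigma(i)}."""
    out = [0] * len(c)
    for i, ci in enumerate(c):
        out[sigma[i]] += ci
    return tuple(out)

S4 = list(permutations(range(4)))
variants = {act(s, coeffs) for s in S4}                   # distinct variants of f
stabilizer = [s for s in S4 if act(s, coeffs) == coeffs]  # permutations leaving f unchanged

print(len(S4), len(stabilizer), len(variants))
```

The 24 permutations split into 6 variants, each produced by exactly 4 permutations, in agreement with $m = [S_4 : S_{4,f}] = 24/4$.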


Another setting wherein the cosets of a group have an interesting interpretation is that
of the vertex symmetries of the tetrahedron. Let G be this group of vertex symmetries,
and let H be the subgroup that consists of all the vertex symmetries that leave the vertex
4 unchanged. That is,

G = {Id, (12)(34), (13)(24), (14)(23), (123), (132)}
    ∪ {(124), (142), (134), (143), (234), (243)}

and H = {Id, (123), (132)}. Then the cosets of H are

{Id, (123), (132)},       {(12)(34), (243), (143)},
{(13)(24), (142), (234)}, {(14)(23), (134), (124)}.

Note that the permutations of the first coset all fix 4, the permutations of the second coset
all transform 4 to 3, those of the third coset all transform 4 to 2, and the permutations
of the last coset all transform 4 to 1. There is a general principle in operation here, and
it can be found in Exercise 9.4.40.
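
The tetrahedron example can be replayed in code. The sketch below builds the even permutations of four symbols, takes H to be those fixing the last symbol, and groups the cosets aH by where they send that symbol. The 0-based labels (so that vertex 4 becomes index 3), the inversion-count parity test, and the right-to-left product are assumptions made for illustration.

```python
from itertools import permutations

def compose(p, q):
    """Return the product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_even(p):
    """A permutation is even exactly when it has an even number of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0

A4 = [p for p in permutations(range(4)) if is_even(p)]
H = [p for p in A4 if p[3] == 3]  # symmetries leaving the last vertex unchanged

left_cosets = {frozenset(compose(a, h) for h in H) for a in A4}
# For each coset, collect the images of the last vertex under its members.
images = {frozenset(p[3] for p in coset) for coset in left_cosets}

print(len(A4), len(H), len(left_cosets))
print(sorted(sorted(s) for s in images))
```

Each coset sends the chosen vertex to a single destination, and different cosets send it to different destinations.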


Exercises 9.4 


1. Find all the subgroups of $(Z_n, +)$ for n = 1, 2, ..., 10.

2. Find all the subgroups of $S_n$ for n = 1, 2, 3.

3. Show that $S_4$ contains subgroups of orders 1, 2, 3, 4, 6, 8, 12, and 24.

4. Show that $S_4$ contains two nonisomorphic subgroups of order 4.

5. Show that $S_5$ contains subgroups of orders 6, 8, 10, and 12.

6. Find all the subgroups of $(GF(2, x^2 + x + 1), +)$.

7. Find all the subgroups of $(GF(2, x^3 + x^2 + 1), +)$.

8. Find all the subgroups of $D_n$ for n = 3, 4, 5.

9. Find all the subgroups of the Quaternion group.

10. Prove that $A_4$ does not have a subgroup of order 6.


Compute the cosets of the subgroup H of G as specified in Exercises 9.4.11 to 9.4.22. 


11. $G = (Z_{15}, +)$, H = {0, 3, 6, 9, 12}

12. $G = (Z_{15}, +)$, H = {0, 5, 10}

13. $G = (Z_{18}, +)$, H = {0, 9}

14. $G = (Z_{24}, +)$, H = {0, 6, 12, 18}

15. G=S,, H = {Id, (12)}

16. G=S,, H = {Id, (123), (132)}

17. G=A,, H = K

18. G=A,, H = {Id, (123), (132)}

19. G is the Quaternion group, H = {1, d}

20. G is the Quaternion group, H = {1, a, d, e}

21. $G = D_n$, $H = \{1, \rho, \rho^2, \ldots, \rho^{n-1}\}$, where ρ is counterclockwise rotation by $2\pi/n$.

22. $G = D_n$, H = {1, a}, where a is a 180° rotation about any diagonal of the underlying
polygon.


23. For which values of k can a function of three variables have k distinct variants?
Justify your answer.

24. For which values of k can a function of four variables have k distinct variants?
Justify your answer.

25. For which of the following values of k can a function of five variables have k
distinct variants? Justify your answer.

(a) k = 1, 2, 3, 4, 5 (b) k = 10, 11, 12, 13, 14

26. For which values of k = 1, 2, 3, 4, 5, 6, 7 can a function of 7 variables have k distinct
variants? Justify your answer.

27. Prove that for each positive integer m there exists a function of 7 variables that has 
(m—1)! variants. 


In Exercises 9.4.28 to 9.4.37, does $S_5$ contain a subgroup of the given order? Justify your
answers.

28. 50    30. 30    32. 18    34. 10    36. 6
29. 40    31. 24    33. 15    35. 8     37. 5


38. Find a divisor d of 6! = 720 such that $S_6$ does not have a subgroup of order d.


39. Prove that for every positive integer n > 1, the set $A_n$ of even permutations of
{1, 2, ..., n} is a subgroup of $S_n$. Show that $A_n$ contains exactly half the elements
of $S_n$.

40. Let G be a group of permutations of the set {1, 2, ..., n} and let $G_1$ be the set of
all those permutations σ of G such that σ(1) = 1.
(a) Prove that $G_1$ is a subgroup of G.
(b) Show that two elements ρ and σ of G belong to the same coset of $G_1$ in G if
and only if ρ(1) = σ(1).

41. Suppose A and B are subgroups of G. Prove that $A \cap B$ is also a subgroup of G.
Is the same true for $A \cup B$?

42. Suppose H is a subgroup of G and g is some element of G. Prove that the set
$gHg^{-1} = \{ghg^{-1} \mid h \in H\}$ is also a subgroup of G and is isomorphic to H.


The centralizer $Z_a$ of the element a of a group G consists of the set of all the elements
of G that commute with a. That is, $Z_a = \{x \in G \mid xa = ax\}$.

43. Prove that for any a in G, $Z_a$ is a subgroup of G.
44. Compute the centralizer of each element of the following groups: 


(2) (Zy+) () A, 
(b) S; (d) the Quaternion group 


The center Z(G) of the group G consists of the set of all the elements of G that commute
with all the elements of G. That is, $Z(G) = \{x \in G \mid xa = ax \text{ for all } a \in G\}$.


45. Prove that the center of the group G is a subgroup of G.

Find the centers of the groups in Exercises 9.4.46 to 9.4.51.

46. (Z,,+) 48. S; 50. D, 
47- Quaternion group 49. A, 51. S 
52. Let G be a permutation group on {1, 2, ..., n}, and let H consist of all the even
permutations of G. Prove that if H does not equal G, then its order equals half
the order of G.


53. Prove that S, does not contain a subgroup isomorphic to the Quaternion group. 


54. Prove that if 2 > 4 and p isa prime such that p <n, then S, contains no subgroup 


of index & for any & such that2<k< p. 


55. Let P(x) be a polynomial that is irreducible over $Z_p$, where p is prime, and let
M(x) be the minimal polynomial of some $\alpha \in GF(p, P(x))$. Prove that the degree
of M(x) is a divisor of the degree of P(x).


56. Let p be a prime number and n a positive integer. Prove that every group of order
$p^n$ has an element of order p.


57. Let f be a function of four variables that has three distinct variants. Prove that a
certain pair of these variables can be interchanged without changing the value of
the function. For example, if f = xy + zw, then {z, w} constitute such a pair.
58. Prove Proposition 9.12. 
59- Prove that A, does not contain a subgroup that is isomorphic to S;,. 
60. Prove that A, does not contain a subgroup that is isomorphic to S,. 


61. Prove that the third condition for subgroups is redundant when G is finite. 




62. Prove that if the union of two subgroups of G is a group, then one of those 


subgroups contains the other. 


9.5 Cyclic Groups and Subgroups 


If a is any element of the group G, we let $\langle a \rangle$ denote the set of all the integer powers
of a. Since $a^0 = 1_G$, $a^m a^n = a^{m+n}$ for all integers m and n, and $(a^m)^{-1} = a^{-m}$, it follows
that $\langle a \rangle$ is a subgroup of G. The subgroup $\langle a \rangle$ is said to be generated by a. If σ is a
permutation, then there is nothing new about this notation; it was already used in the
same sense in that more restricted context.

If G = Z, then $\langle 2 \rangle$ consists of all the even integers. If $G = (Z_{2n}, +)$, for some positive
integer n, then $\langle 2 \rangle = \{0, 2, 4, \ldots, 2n-2\}$ and $\langle n \rangle = \{0, n\}$. On the other hand, in
$Z_5$, $\langle 2 \rangle = \{0, 2, 4, 1, 3\} = Z_5$.

This, of course, generalizes to the fact that $\langle 2 \rangle = Z_n$ whenever n is an odd integer.
If ζ is the Galois imaginary associated with the irreducible polynomial $x^4 + x^3 + 1$ over
$Z_2$, then ζ is primitive so that it has order 15 in $F^*$ where $F = GF(2, x^4 + x^3 + 1)$.
Consequently, $\langle \zeta^3 \rangle = \{1, \zeta^3, \zeta^6, \zeta^9, \zeta^{12}\}$ and $\langle \zeta^5 \rangle = \{1, \zeta^5, \zeta^{10}\}$.

If it so happens that a is an element of the group G such that $\langle a \rangle = G$, then we say
that a is a generator of G. Thus, 1 is always a generator of $(Z_n, +)$, every odd element
of $Z_8$ is a generator of $Z_8$, and each of the elements $\zeta, \zeta^2, \zeta^4, \zeta^7, \zeta^8, \zeta^{11}, \zeta^{13}$, and
$\zeta^{14}$ is a generator of the multiplicative group $F^*$ of the Galois field above. A group
that is generated by one of its elements is said to be a cyclic group. Thus, $(Z_n, +)$ is
cyclic for each n since, as noted above, it has 1 as a generator. The Primitive Element
Theorem (Theorem 7.17) asserts that the multiplicative group of the nonzero elements
of every Galois field is cyclic. On the other hand, the additive group of a Galois
field GF(p, P(x)), where P(x) has degree ν, is not cyclic for $\nu \ge 2$, since it has order
$p^\nu$ and every nonzero element has order p. Similarly, the group $S_n$ is not cyclic for
n > 2, since every cyclic group is necessarily commutative.
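
The examples of generated subgroups and generators in $(Z_n, +)$ can be recomputed as follows. This is a small sketch; the function names are assumptions, and the last line models the multiplicative group of order 15 above by adding exponents modulo 15.

```python
from math import gcd

def generated_subgroup(a, n):
    """Elements 0, a, 2a, ... of the subgroup generated by a in (Z_n, +), in order of appearance."""
    elems, x = [0], a % n
    while x != 0:
        elems.append(x)
        x = (x + a) % n
    return elems

def generators(n):
    """All elements of Z_n whose generated subgroup is the whole group."""
    return [a for a in range(n) if len(generated_subgroup(a, n)) == n]

print(generated_subgroup(2, 5))   # 2 generates all of Z_5
print(generated_subgroup(2, 6))   # in Z_6 the element 2 generates only the even elements
print(generators(8))              # the odd elements of Z_8
print(generators(15))             # exponents k for which the k-th power of a generator also generates
```

In each case the generators found are exactly the elements a with gcd(a, n) = 1, which is consistent with part (c) of Proposition 9.7.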

The cyclic groups are considered to be the simplest of all groups, and they can be 


classified in a very simple manner. 
Theorem 9.14 Every two cyclic groups of the same order are isomorphic. 


Proof. Let $(G, \cdot)$ and $(H, \otimes)$ be two finite cyclic groups of order n. Suppose that they
are generated by the elements a and b, respectively, so that, because they are finite,
$G = \{1_G, a, a^2, a^3, \ldots, a^{n-1}\}$ and $H = \{1_H, b, b^2, b^3, \ldots, b^{n-1}\}$. Then the function
$f(a^k) = b^k$ for each k = 0, 1, 2, ..., n-1 is an isomorphism of G and H because

$$f(a^k \cdot a^m) = f(a^{k+m}) = b^{k+m} = b^k \otimes b^m = f(a^k) \otimes f(a^m)$$

where the exponents are added modulo n. ∎


The proof of the theorem for infinite cyclic groups is relegated to Exercise 9.5.27. 


It follows from this theorem that for a fixed positive integer n, the groups $(\sqrt[n]{1}, \cdot)$,
$(Z_n, +)$, $\langle (1\,2\,\ldots\,n) \rangle$ are all isomorphic to one another. Similarly, if P(x) is irreducible
of degree ν over $Z_p$, then the group $(GF^*(p, P(x)), \cdot)$ is isomorphic to $(Z_{p^\nu - 1}, +)$. In
particular, $(Z_p^*, \cdot)$ is isomorphic to $(Z_{p-1}, +)$.
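
As a concrete instance of the last statement, the map $k \mapsto 2^k$ carries $(Z_4, +)$ onto $(Z_5^*, \cdot)$. The sketch below checks the isomorphism requirements for this one case; the choice p = 5 and the generator 2 are assumptions made for illustration (2 does generate the nonzero residues modulo 5, but it is not a generator for every prime).

```python
p, b = 5, 2  # b = 2 generates the multiplicative group of nonzero residues modulo 5

# f(k) = b^k, the map used in the proof of Theorem 9.14 with a = 1 in (Z_4, +)
f = {k: pow(b, k, p) for k in range(p - 1)}

# Every nonzero residue is hit exactly once.
bijective = sorted(f.values()) == list(range(1, p))

# f(k + m) = f(k) * f(m), with exponents added modulo p - 1.
respects_operations = all(f[(k + m) % (p - 1)] == (f[k] * f[m]) % p
                          for k in f for m in f)

print(f, bijective, respects_operations)
```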

As was mentioned above, one of the main tasks of abstract group theory is the
classification of all groups up to isomorphism. Since every two isomorphic groups necessarily
have the same order, this can be rephrased as looking for the classification of all groups
of a fixed order n. The next proposition resolves this classification problem when n is a
prime number.


Proposition 9.15 Every group of a prime order p is cyclic and is therefore isomorphic
to $(Z_p, +)$.


Proof. Let G be a group of order p where p is a prime integer. Let a be any nonidentity
element of G. If k is the order of a, then the subgroup $\langle a \rangle$ also has order k. However,
by Theorem 9.8, k must divide p, which is prime. Since k > 1, it follows that k = p, so
that $\langle a \rangle = G$. ∎


Clearly, then, every group of order 5 is necessarily isomorphic to $(Z_5, +)$. Curiously,
this fact seems to have eluded Cayley in his 1878 paper. Another surprising consequence
of this proposition is that every group of prime order is necessarily commutative, since it
is isomorphic to a commutative group. The following consequence is also very useful.


Proposition 9.16 If G is a group of finite order n, and if a is any element of G, then
$a^n = 1_G$. Consequently, the order of a is a divisor of the order of G.


Proof. Let G be a group of order n, and let a be an element of G. The subgroup $\langle a \rangle$
has order o(a). By Theorem 9.8, o(a) is therefore a divisor of n. It now follows from
Proposition 9.7 that $a^n = 1_G$. ∎


Proposition 9.16 yields new proofs of Fermat's Theorem (Theorem 5.15) and Galois’s 


Theorem (Theorem 7.11; see Exercise 9.5.12). 
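
For the multiplicative group of nonzero residues modulo a prime p, which has order p - 1, Proposition 9.16 says that $a^{p-1} = 1$ for every a. The sketch below checks this numerically for p = 17; the choice of prime and the helper name `order_mod` are assumptions made for illustration.

```python
def order_mod(a, p):
    """Order of a in the multiplicative group of nonzero residues modulo the prime p."""
    n, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        n += 1
    return n

p = 17
# a^(p-1) = 1 for every nonzero residue a
fermat_holds = all(pow(a, p - 1, p) == 1 for a in range(1, p))

# the order of every element divides the group order p - 1
orders = {a: order_mod(a, p) for a in range(1, p)}
orders_divide = all((p - 1) % k == 0 for k in orders.values())

print(fermat_holds, orders_divide, orders[9], orders[13])
```

The orders of 9 and 13 computed here match the subgroups {1, 9, 13, 15, 16, 8, 4, 2} and {1, 13, 16, 4} mentioned in Section 9.4.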




We conclude this section by pointing out that the information we have obtained so 


far also allows us to classify all the groups of order 4 up to isomorphism. 


Proposition 9.17 If G is a group of order 4, then it is isomorphic to either (Z,4, +) or 
to K. 


Sketch of proof. If G has an element of order 4, then it is cyclic and hence it is isomorphic 
to (Z,, +). Otherwise, every nonidentity element of G has order 2, meaning that the 
diagonal entries of the multiplication table of G are all 1... It is now easily verified that 
the multiplication table of G must be identical with that of K. The details are relegated 


to Exercise 9.5.1. r 


With Proposition 9.17 available we have classified all the groups of order at most 5 up to isomorphism. Those of orders 1, 2, 3, and 5 are isomorphic to $(\mathbb{Z}_1, +)$, $(\mathbb{Z}_2, +)$, $(\mathbb{Z}_3, +)$, and $(\mathbb{Z}_5, +)$, respectively, whereas every group of order 4 is isomorphic to either $(\mathbb{Z}_4, +)$ or $K$. A sense of the enormity of the task of classifying all finite groups can be obtained from the amount of theory that was required by the classification of these small groups alone.
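The dichotomy in Proposition 9.17 can be turned into a short test. This is only a sketch under stated assumptions: the group is given as a 4 x 4 multiplication table on the labels 0, 1, 2, 3 with 0 as the identity, and the names `element_order` and `classify_order_4` are hypothetical.

```python
def element_order(table, x, identity=0):
    """Order of x in the group whose multiplication table is `table`."""
    k, y = 1, x
    while y != identity:
        y = table[y][x]   # multiply the running power by x once more
        k += 1
    return k

def classify_order_4(table):
    """Return 'Z_4' if some element has order 4, otherwise 'K'."""
    orders = [element_order(table, x) for x in range(4)]
    return "Z_4" if 4 in orders else "K"

# Two sample tables: addition modulo 4, and bitwise XOR (a copy of K).
Z4 = [[(i + j) % 4 for j in range(4)] for i in range(4)]
K = [[i ^ j for j in range(4)] for i in range(4)]
```

The function assumes that its input really is the table of a group of order 4; it does not verify the group axioms.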


Exercises 9.5 


1. Complete the proof of Proposition 9.17. 


For each of the groups in Exercises 9.5.2 to 9.5.11, decide whether it is isomorphic to $(\mathbb{Z}_4, +)$ or to $K$.


2. {Id,(12),(34)(12)(34)} 7 (Zz,°) 

3- (Zé, -) 8. (GF(2,x*+x+1),+) 
4 ((1234)) 9. (Zi) 

5. ((1234)(56)) 10. V1 


6. Sz ¢ where f = x, 42x74 2x3 +4 oar, Z,|x,<1] 


12. Use Proposition 9.16 to give a new proof of Fermat's Theorem (Theorem 5.15) and Galois's Theorem (Theorem 7.11).

13. Prove that every subgroup of every cyclic group is also cyclic.

14. Suppose $G$ is a group of order 187. Prove that if two subgroups of $G$ have the same order, then they are isomorphic.




Find the largest value of $k$ for which the groups in Exercises 9.5.15 to 9.5.23 contain a cyclic subgroup of order $k$.


15. V1 18. Ss 21. Ary 
16. (GF(p,P(x))+) 19 A, 22. (Zip,°) 
17, Dy 20. Si, 23. (Zi,°) 


24. Prove that a group has exactly one subgroup if and only if it is isomorphic to $(\mathbb{Z}_1, +)$.

25. Prove that a group has exactly two subgroups if and only if it is isomorphic to $(\mathbb{Z}_p, +)$ for some prime $p$.

26. Prove that a group has exactly three subgroups if and only if it is isomorphic to $(\mathbb{Z}_{p^2}, +)$ for some prime $p$.

27. Complete the proof of Theorem 9.14 (the infinite case).

28. Prove that if $m$ is relatively prime to $n$, then $m^{\varphi(n)} \equiv 1 \pmod{n}$, where $\varphi(n)$ is the Euler $\varphi$ function.


9.6 Cayley's Theorem 


The first groups to be examined by mathematicians were groups of permutations. It was not until a century had passed that Cayley pointed out that every group is determined up to isomorphism by its multiplication table, and that therefore this table could be used to define the notion of an abstract group. At the same time Cayley noted that this innovation did not introduce any genuinely new structures into the study of groups, for, he said, every abstract group can be shown to be isomorphic to a group of permutations. Cayley did not formally prove this assertion; he contented himself with an example. His short note on the subject is included as Appendix E. Cayley's assertion will be formally stated and proved below as Theorem 9.18, but we first paraphrase Cayley's ideas in more modern terminology. Table 9.12 contains the multiplication table of the symmetric group $S_3$ with $a = (1\,2)$, $b = (3\,2\,1)$, $c = (1\,3)$, $d = (1\,2\,3)$, and $e = (2\,3)$.

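       | Id   a    b    c    d    e
  -----+-----------------------------
   Id  | Id   a    b    c    d    e
   a   | a    Id   c    b    e    d
   b   | b    e    d    a    Id   c
   c   | c    d    e    Id   a    b
   d   | d    c    Id   e    b    a
   e   | e    b    a    d    c    Id

Table 9.12 The multiplication table of $S_3$

Reverting to our original notation for permutations, we associate with each element $x$ a two-rowed array $P_x$ whose first row is "Id a b c d e" and whose second row is that row of Table 9.12 that corresponds to the element $x$. Thus,

\[
P_{\mathrm{Id}} = \begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ \mathrm{Id} & a & b & c & d & e \end{pmatrix}, \qquad
P_a = \begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ a & \mathrm{Id} & c & b & e & d \end{pmatrix},
\]
\[
P_b = \begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ b & e & d & a & \mathrm{Id} & c \end{pmatrix}, \qquad
P_c = \begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ c & d & e & \mathrm{Id} & a & b \end{pmatrix},
\]
\[
P_d = \begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ d & c & \mathrm{Id} & e & b & a \end{pmatrix}, \qquad
P_e = \begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ e & b & a & d & c & \mathrm{Id} \end{pmatrix}.
\]

Note that

\[
P_a P_d = \begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ a & \mathrm{Id} & c & b & e & d \end{pmatrix}
\begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ d & c & \mathrm{Id} & e & b & a \end{pmatrix}
= \begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ e & b & a & d & c & \mathrm{Id} \end{pmatrix} = P_e = P_{ad}
\]

and

\[
P_d P_e = \begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ d & c & \mathrm{Id} & e & b & a \end{pmatrix}
\begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ e & b & a & d & c & \mathrm{Id} \end{pmatrix}
= \begin{pmatrix} \mathrm{Id} & a & b & c & d & e \\ a & \mathrm{Id} & c & b & e & d \end{pmatrix} = P_a = P_{de}.
\]

In other words, the function that assigns to each element $x$ the corresponding permutation $P_x$ behaves just like an isomorphism. It is in fact always an isomorphism, and that is the gist of Cayley's assertion.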
Theorem 9.18 (Cayley) Every group is isomorphic to a group of permutations. 


Proof. Let $G$ be any group. To each element $x$ of $G$ we assign a permutation $P_x$ of the elements of $G$ which transforms each element $a$ to $xa$, i.e., $P_x(a) = xa$ for all $x, a \in G$. The function $P_x$ is a permutation because $xa = xb$ if and only if $a = b$, and because $P_x(x^{-1}y) = y$ for every $y \in G$. Let $H$ be the set of permutations thus obtained, i.e., $H = \{ P_x \mid x \in G \}$. To see that $H$ constitutes a group of permutations we observe first that for any two elements $P_x$ and $P_y$ of $H$ we have

\[ (P_x \circ P_y)(a) = P_x(P_y(a)) = P_x(ya) = xya = P_{xy}(a) \]

so that $P_x \circ P_y = P_{xy}$. Hence the composition of any two elements of $H$ is in $H$, and the inverse of $P_x$ is $P_{x^{-1}}$, which is also in $H$. Thus, $H$ is a group of permutations.


Suppose $x$ and $y$ are distinct elements of $G$. Then

\[ P_x(\mathrm{Id}) = x\,\mathrm{Id} = x \neq y = y\,\mathrm{Id} = P_y(\mathrm{Id}) \]

and so $P_x$ and $P_y$ are distinct elements of $H$. Thus the function $f(x) = P_x$ matches up all the elements of $G$ and $H$. That $f$ is in fact an isomorphism follows from

\[ f(xy) = P_{xy} = P_x \circ P_y = f(x) \circ f(y). \]
∎
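The three steps of the proof (each $P_x$ is a permutation, distinct elements give distinct permutations, and $P_x \circ P_y = P_{xy}$) can be verified by brute force for a small group. The sketch below assumes a finite group supplied as a list of elements together with a binary operation; the helper names and the two sample groups are illustrative choices, not part of the text.

```python
from itertools import permutations, product

def cayley_representation(elements, op):
    """Return {x: P_x}, where P_x is stored as the dict a -> op(x, a)."""
    return {x: {a: op(x, a) for a in elements} for x in elements}

def compose(p, q):
    """Composition p o q of two permutations stored as dicts (q acts first)."""
    return {a: p[q[a]] for a in q}

def check_cayley(elements, op):
    """Check the three facts used in the proof for a finite group."""
    P = cayley_representation(elements, op)
    each_is_permutation = all(set(P[x].values()) == set(elements)
                              for x in elements)
    distinct = len({tuple(P[x][a] for a in elements)
                    for x in elements}) == len(elements)
    multiplicative = all(compose(P[x], P[y]) == P[op(x, y)]
                         for x, y in product(elements, repeat=2))
    return each_is_permutation and distinct and multiplicative

# Sample groups (assumptions for illustration): (Z_6, +) and S_3 acting
# on the points 0, 1, 2, with permutations stored as tuples of images.
Z6 = list(range(6))
add_mod_6 = lambda x, y: (x + y) % 6
S3 = list(permutations(range(3)))
compose_tuples = lambda p, q: tuple(p[q[i]] for i in range(3))
```

For instance, `cayley_representation(Z6, add_mod_6)[2]` is the permutation that shifts every residue by 2.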


Exercises 9.6 


1. Write out the Cayley representation P. for every element x of (Z,, +). 

2. Write out the Cayley representation P. for every element x of (Z;, +). 

3. Write out the Cayley representation P. for every element x of (Z,, +). 

4. Write out the Cayley representation P. for every element x of K. 

5- Write out the Cayley representation P. for every element x of (Z,, +). 

6. Write out the Cayley representation P. for every element x of the Quaternion 

group. 
7. Write out the Cayley representation P. for every element x of (Z,, +). 


8. Prove that for any finite group $G$ and for any element $x$ of $G$, all the cycles in the disjoint cycle representation of $P_x$ have the same length. Explain why this common cycle length is necessarily a divisor of the order of $G$.

9. A set with a binary operation whose multiplication table is a Latin square is called a loop. The definition of the permutation $P_x$ applies to loops as well as to groups. Prove that a loop $X$ is a group if the set of permutations $P_x$ with $x \in X$ is a group (under composition).




Chapter Summary 


We have defined groups, both concretely, as permutations, and abstractly, in terms of 
axioms. Many of the algebraic systems examined in the earlier chapters are examples of 
such groups and this leads to a natural classification problem. The notion of isomorphism 
was defined to formalize the notion of “sameness” of groups. Some inroads were made 
on this difficult, if not impossible, classification problem by recognizing cyclic groups 
and using them to show that any two groups of the same prime order are necessarily 
isomorphic. Finally, we presented Cayley’s Theorem which asserts that every abstract 


group is in fact isomorphic to some group of permutations. 
Chapter Review Exercises 


Mark the following true or false. 
1. The set of permutations that leave the function (x, + x,)(x, + x,) unchanged is a 
group. 
2. Every permutation belongs to some group. 
3- There is no function f such that S, - = D,,. 
4. No group of permutations is an abstract group. 
5. V1 is an abstract group. 
6. Vlisa permutation group. 
7. The inverse of (1 2 3 4 5)(6 7 8 9) is (5 4 3 2 1)(9 8 7 6).
8. Every two groups of order 4 are isomorphic. 
9. If the element a of the group G has order 24, then o(a”°) = 6. 
10. 5S, has a subgroup of order 6. 
11. 5, has a subgroup of order 7. 
12. If $H$ is a subgroup of $G$, then every coset of $H$ in $G$ is also a subgroup of $G$.
13. If a subgroup of $S_n$ has order $n$, then its index is $(n-1)!$.
14. The permutations (1 2 ) and (13) belong to the same coset of A, in S,. 


15. Let f =x,x,+x,+x,+x,. Then the permutations (13 )(245) and(14523) 


belong to the same coset of Ss f 
16. There is a function of ro variables that has 11 distinct variants. 


17. Every two groups of order 17 are isomorphic. 




18. Sj, has no element of order 11. 
19. Given any three groups of order 4, some two of them are isomorphic. 
20. The group (Z,.,+) is isomorphic to some group of permutations. 
New Terms 
abstract group, 193 index, 210 
alternating group, 186 isomorphic groups, 202 
automorphism, 205 isomorphism, 202 
center, 214 Klein 4-group, 187 
centralizer, 214 Latin square, 196 
commutative group, 195 Quaternion group, 195 
coset, 207 subgroup, 206 
cyclic group, 215 symmetric group, 184 
dihedral group, 188 trivial subgroup, 207 
group of permutations, 184 vertex symmetries, 187 
Supplementary Exercises 
1. Write a computer script which will decide whether or not a given multiplication 
table is a Latin square. 
2. Write a computer script which will decide whether or not a given multiplication 
table describes a group. 
3. Write a computer script that will list all the subgroups of a group that is given by 
its multiplication table. 
4. Write a computer script that will list all the subgroups of $S_n$ for as many values of $n$ as possible.
5. Classify the subgroups of $S_n$ by isomorphism type for as many values of $n$ as possible.


6. Decide whether the groups whose multiplication tables appear below are isomorphic.

7. Characterize all the groups that have exactly four subgroups.

8. Characterize all the groups that have exactly five subgroups.

9. Write a computer program that will list all the cosets of a subgroup of a group.

10. For each positive integer $n$, what are the $n$-dimensional analogs of the five three-dimensional regular solids? Describe their vertex symmetry groups.


Chapter 10


QUOTIENT GROUPS AND THEIR USES 


We will define an operation on groups that is in some ways analogous to division of integers. This operation will provide us with a rigorous method for constructing the Galois fields of Chapter 7 as well as a host of new fields. Finally, Galois's theorem on the resolvability of algebraic equations will be described.


10.1 Quotient Groups 


The notion of a set as a point is one of the recurrent themes of modern mathematics. Loosely speaking, the idea is to create new structures from old ones by considering sets of elements of the old structure as elements of the new structure. In the context of group theory this process takes the following form. Let $(G, \cdot)$ be a group, and let $S$ and $T$ be any two subsets of $G$. We define

\[ S \cdot T = \{ a \cdot b \mid a \in S \text{ and } b \in T \} \]

and $S^{-1} = \{ a^{-1} \mid a \in S \}$.

For example, if $G = (\mathbb{Z}_9, +)$, $S = \{1, 3, 5\}$, and $T = \{3, 7\}$, then

\[ S + T = \{1+3,\ 1+7,\ 3+3,\ 3+7,\ 5+3,\ 5+7\} = \{4, 8, 6, 1, 8, 3\} = \{1, 3, 4, 6, 8\} \]

and

\[ S^{-1} = \{-1, -3, -5\} = \{4, 6, 8\}. \]
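The two set operations are easy to experiment with. The following minimal sketch assumes the additive group $\mathbb{Z}_n$ with residues represented by the integers $0, \dots, n-1$; the function names `set_sum` and `set_neg` are hypothetical.

```python
def set_sum(S, T, n):
    """The set S + T = {a + b : a in S, b in T} computed in (Z_n, +)."""
    return {(a + b) % n for a in S for b in T}

def set_neg(S, n):
    """The set of additive inverses of the elements of S in (Z_n, +)."""
    return {(-a) % n for a in S}

# The example from the text, worked modulo 9.
S, T = {1, 3, 5}, {3, 7}
```

Because Python sets discard repetitions, the repeated value 8 in the worked example appears only once in the result.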


Given a group $(G, \cdot)$, this process defines a binary operation on all the subsets of $G$. The associativity of the operation $\cdot$ on the elements of $G$ entails its associativity as an operation on sets as well (Exercise 10.1.35). Let us examine this operation on some collections of cosets. Suppose $(G, \cdot) = (\mathbb{Z}_{12}, +)$, $H$ is the subgroup $\{0, 3, 6, 9\}$, and let the cosets of $H$ be labeled

\[ H_0 = H = \{0, 3, 6, 9\}, \qquad H_1 = 1 + H = \{1, 4, 7, 10\}, \qquad H_2 = 2 + H = \{2, 5, 8, 11\}. \]

Then, for example,

\[
\begin{aligned}
H_1 + H_2 &= \{1+2,\ 1+5,\ 1+8,\ 1+11,\ 4+2,\ 4+5,\ 4+8,\ 4+11\} \\
&\quad \cup \{7+2,\ 7+5,\ 7+8,\ 7+11,\ 10+2,\ 10+5,\ 10+8,\ 10+11\} \\
&= \{3, 6, 9, 0, 6, 9, 0, 3, 9, 0, 3, 6, 0, 3, 6, 9\} \\
&= \{0, 3, 6, 9\} = H_0.
\end{aligned}
\]

In fact, the sum of any two of the cosets of this subgroup $H$ is again a coset of $H$. Table 10.1 summarizes the results of the operation of addition on the cosets of $H$ and makes it clear that the cosets of $H$ form a new group that is isomorphic to $\mathbb{Z}_3$.

    +   | H_0  H_1  H_2
  ------+----------------
   H_0  | H_0  H_1  H_2
   H_1  | H_1  H_2  H_0
   H_2  | H_2  H_0  H_1

Table 10.1 A quotient group of $(\mathbb{Z}_{12}, +)$
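The coset table for this example can be generated mechanically. The sketch below assumes the additive group $\mathbb{Z}_{12}$ and the subgroup $\{0, 3, 6, 9\}$ from the text; the helpers `cosets` and `coset_sum` are illustrative names, and cosets are indexed in the order in which they are first encountered.

```python
def cosets(n, H):
    """List the distinct cosets a + H of the subgroup H in (Z_n, +)."""
    seen, result = set(), []
    for a in range(n):
        if a not in seen:
            coset = frozenset((a + h) % n for h in H)
            result.append(coset)
            seen |= coset
    return result

def coset_sum(A, B, n):
    """The set A + B computed in (Z_n, +)."""
    return frozenset((a + b) % n for a in A for b in B)

n, H = 12, {0, 3, 6, 9}
Hs = cosets(n, H)
# Entry [i][j] is the index of the coset H_i + H_j; list.index would
# raise ValueError if some sum failed to be a coset.
table = [[Hs.index(coset_sum(A, B, n)) for B in Hs] for A in Hs]
```

The resulting index table is the addition table of $\mathbb{Z}_3$, in agreement with the isomorphism noted in the text.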


Let us examine the Quaternion group $G$ of Table 9.3. Since $d^2 = 1$, it follows that $H = \{1, d\}$ is a subgroup of $G$. Its cosets are

\[ 1H = dH = \{1, d\}, \qquad aH = eH = \{a, e\}, \qquad bH = fH = \{b, f\}, \qquad cH = gH = \{c, g\}. \]

Note that

\[ (aH)(bH) = \{ab,\ af,\ eb,\ ef\} = \{c,\ g,\ g,\ c\} = \{c, g\} = cH. \]

        | H    aH   bH   cH
  ------+--------------------
   H    | H    aH   bH   cH
   aH   | aH   H    cH   bH
   bH   | bH   cH   H    aH
   cH   | cH   bH   aH   H

Table 10.2 A quotient group of the Quaternion group

Table 10.2 displays the result of multiplying any two of the cosets of the subgroup $H = \{1, d\}$ of the Quaternion group. It is clear that this multiplication table is isomorphic with that of the Klein 4-group of Table 9.4. We hasten to point out that the cosets of a subgroup do not always form a group. For example, if $G = S_3$ and $H = \{\mathrm{Id}, (1\,2)\}$, then $(1\,3)H = \{(1\,3), (1\,2\,3)\}$, but

\[
\begin{aligned}
[(1\,3)H][(1\,3)H] &= \{(1\,3)(1\,3),\ (1\,3)(1\,2\,3),\ (1\,2\,3)(1\,3),\ (1\,2\,3)(1\,2\,3)\} \\
&= \{\mathrm{Id},\ (1\,2),\ (2\,3),\ (1\,3\,2)\}
\end{aligned}
\]

which is clearly not a coset of $H$. This counterexample notwithstanding, the previous two examples indicate that the cosets of a subgroup form a group of their own often enough for this phenomenon to merit attention. The following lemma acknowledges the fact that in one very special case, the product of two cosets is necessarily a coset.


Lemma 10.1 If $H$ is any subgroup of $G$, then $HH = H$.

Proof. Since $H$ is a subgroup, the product of any two of its elements is in $H$ and hence $HH \subseteq H$. On the other hand, since $1_G \in H$, it also follows that $HH \supseteq H 1_G = H$. Hence, $HH = H$. ∎


Let us examine the question of just when the product of two cosets is necessarily a coset. In order for the two cosets $aH$ and $bH$ of the subgroup $H$ to have another coset, say $cH$, as their product, this coset $cH$ would have to contain the element $ab$, since $a$ is in $aH$ and $b$ is in $bH$. Hence, since the coset that contains $ab$ is $abH$, we would have $(aH)(bH) = cH = abH$, or, upon multiplying both sides on the left by $b^{-1}a^{-1}$, $b^{-1}HbH = H$. Hence for any $h \in H$,

\[ b^{-1}hb = b^{-1}hb\,1_G \in b^{-1}HbH = H. \]

Consequently, if the product of every pair of cosets is again a coset, then $b^{-1}hb \in H$ for all $b \in G$ and $h \in H$.

This motivates the following definition. A subgroup $H$ of $G$ is said to be a normal subgroup if for every $b \in G$ and every $h \in H$, $b^{-1}hb \in H$. Thus, every subgroup of a commutative group $G$ is normal since in such groups

\[ b^{-1}hb = hb^{-1}b = h \in H. \]

If $G$ is the Quaternion group, and if $H = \{1_G, d\}$, then clearly

\[ x^{-1} 1_G\, x = x^{-1}x = 1_G \in H \]

for every element $x \in G$. Moreover, since the column and the row of $d$ in Table 9.3 are identical, it follows that for each element $x \in G$, $dx = xd$ and hence $x^{-1}dx = d$ for all such $x$. Thus $\{1, d\}$ is a normal subgroup of the Quaternion group.

It is clear that a subgroup $H$ of $G$ is normal if and only if $b^{-1}Hb \subseteq H$ for every element $b$ of $G$.
Theorem 10.2 Let $H$ be a subgroup of $G$. Then the following are equivalent:

(a) $H$ is a normal subgroup of $G$;

(b) $x^{-1}Hx = H$ for every element $x$ of $G$;

(c) $xHx^{-1} = H$ for every element $x$ of $G$;

(d) $Hx = xH$ for every element $x$ of $G$;

(e) $(xH)(yH) = xyH$ for all $x, y \in G$;

(f) the multiplication of the cosets of $H$ forms a group.
Proof. (a) $\Rightarrow$ (b). If $H$ is a normal subgroup of $G$, then for every element $x$ of $G$, $x^{-1}Hx \subseteq H$. Consequently, if $x$ is any element of $G$, two applications of the definition of normality (one to $x$ and one to $x^{-1}$) yield

\[ H \supseteq x^{-1}Hx \supseteq x^{-1}\left[ xHx^{-1} \right]x = 1_G H 1_G = H. \]

Consequently, $x^{-1}Hx = H$.

(b) $\Rightarrow$ (c). Replacing $x$ by $x^{-1}$ transforms the expression $x^{-1}Hx$ to $xHx^{-1}$.

(c) $\Rightarrow$ (d). If $xHx^{-1} = H$, then $xH = xH(x^{-1}x) = (xHx^{-1})x = Hx$.

(d) $\Rightarrow$ (e). Suppose $x, y \in G$. Since $Hy = yH$ we get

\[ (xH)(yH) = x(Hy)H = x(yH)H = xyHH = xyH. \]

(e) $\Rightarrow$ (f). Let $H$ be a subgroup of $G$ such that $(xH)(yH) = xyH$ holds for every two of its cosets. Then

\[ (xH)(1_G H) = x 1_G H = xH = 1_G xH = (1_G H)(xH) \]

so that the coset $H = 1_G H$ acts as the identity for this multiplication of cosets. In addition, for every coset $xH$ we have

\[ (xH)(x^{-1}H) = xx^{-1}H = 1_G H = x^{-1}xH = (x^{-1}H)(xH). \]

In other words, the coset $x^{-1}H$ is the inverse of the coset $xH$. As was noted above, the multiplication of subsets of a group is always associative. It follows that the multiplication of the cosets of $H$ is indeed a group.

(f) $\Rightarrow$ (a). This was already proved as part of the argument that motivated the definition of normal subgroups. ∎


It would be useful to have a quick method for deciding whether a given subgroup is in fact normal. Unfortunately, no such method is known. There is, however, a variety of helpful ad hoc techniques. It is clear from the definition of normality that both the whole group $G$ and the trivial subgroup $\{1_G\}$ are normal subgroups of $G$. More interestingly, we have the following observation.


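Proposition 10.3 If $H$ is a subgroup of index 2 in the group $G$, then $H$ is a normal subgroup of $G$.

Proof. Assume for a contradiction that $xhx^{-1} \notin H$ for some $x \in G$ and $h \in H$. Then $x \notin H$, and since $H$ has only two cosets, every element outside $H$ lies in $xH$. Hence $xhx^{-1} \in xH$ and there exists an element $k$ in $H$ such that $xhx^{-1} = xk$. But then

\[ x^{-1} = (xh)^{-1}(xk) = h^{-1}x^{-1}xk = h^{-1}k \in H, \]

so that $x \in H$, contradicting the fact that $x \notin H$. ∎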


The alternating group $A_4$, which consists of all the even permutations in $S_4$, has twelve elements and it therefore contains exactly half the elements of the symmetric group $S_4$. It follows that $A_4$ is a normal subgroup of $S_4$ which has only two cosets: itself and its complement in $S_4$.

If $H$ is a subgroup of $G$ such that no other subgroup of $G$ has the same order as $H$, then $H$ is necessarily a normal subgroup of $G$ (Exercise 10.1.21). Thus, since $d$ is the only element of the Quaternion group that has order 2, it follows that $\{1, d\}$ is the only subgroup of the Quaternion group that has order 2. It therefore is a normal subgroup of $G$.

$H_0 = \{0, 3, 6, 9\}$    $H_1 = \{1, 4, 7, 10\}$    $H_2 = \{2, 5, 8, 11\}$

Figure 10.1 The elements of $(\mathbb{Z}_{12}, +)/\{0, 3, 6, 9\}$

If $G$ is a commutative group, then for any subgroup $H$ and for any element $a$ of $G$, $a^{-1}Ha = Ha^{-1}a = H$, and so $H$ is necessarily normal. This situation arises often enough to merit highlighting.

Proposition 10.4 Every subgroup of a commutative group is normal.

The converse to this proposition is false. The Quaternion group is an example of a noncommutative group all of whose subgroups are normal (Exercise 10.1.12).

In the special case where $G$ is a permutation group, Exercise 8.2.23 provides an occasionally useful criterion for recognizing normal subgroups. According to this exercise, if $\rho$ and $\sigma$ are any two permutations in $S_n$, then $\rho\sigma\rho^{-1}$ and $\sigma$ have the same number of $k$-cycles for each positive integer $k$. Consequently, the Klein 4-group

\[ K = \{ \mathrm{Id},\ (1\,2)(3\,4),\ (1\,3)(2\,4),\ (1\,4)(2\,3) \} \]

is a normal subgroup of $A_4$, since $K$ consists of the identity and all three of the elements of $A_4$ that consist of exactly two 2-cycles and nothing else.

The subgroup $H = \{\mathrm{Id}, (1\,2)\}$ of the symmetric group $S_3$ is not normal since for $x = (2\,3)$ we have

\[ xHx^{-1} = (2\,3)\{\mathrm{Id}, (1\,2)\}(2\,3) = \{\mathrm{Id}, (1\,3)\} \neq H. \]
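For small permutation groups, normality can simply be tested by conjugating every element of $H$ by every element of $G$. The sketch below makes several assumptions that are not in the text: permutations act on the points 0, 1, 2 rather than 1, 2, 3, they are stored as tuples of images, and products are composed right to left.

```python
from itertools import permutations

def compose(p, q):
    """The product p o q on {0, ..., n-1}: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse of a permutation stored as a tuple of images."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_normal(H, G):
    """True if x h x^{-1} lies in H for every x in G and every h in H."""
    return all(compose(compose(x, h), inverse(x)) in H for x in G for h in H)

S3 = set(permutations(range(3)))
identity = (0, 1, 2)
swap_first_two = (1, 0, 2)    # plays the role of the transposition (1 2)
swap_last_two = (0, 2, 1)     # plays the role of the transposition (2 3)
A3 = {identity, (1, 2, 0), (2, 0, 1)}
```

Conjugating `swap_first_two` by `swap_last_two` gives `(2, 1, 0)`, the zero-based counterpart of (1 3), which is the computation displayed above.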


If $H$ is a normal subgroup of $G$, then the group formed by the cosets of $H$ in $G$ is called the quotient of $G$ by $H$ and is denoted by $G/H$. Each element of $G/H$ is a coset of $H$ in $G$ and so it is a subset of $G$. Figures 10.1 and 10.2 illustrate this relationship for the examples displayed in Tables 10.1 and 10.2.

Figure 10.2 The elements of the Quaternion group$/\{1, d\}$

The nature of the quotient group $G/H$ is also not easily determined, and again only ad hoc arguments are available. In some cases, however, the quotient group $G/H$ is easily identified. If $H = G$, then $H$ has only one coset in $G$, namely itself, and so $G/H$ is necessarily the trivial group with one element only. At the other end of the spectrum we have the possibility that $H = \{1_G\}$. In this case, each coset of $H$ consists of exactly one element of $G$, and so, in view of Theorem 10.2, $G/H$ is isomorphic to $G$. A somewhat less trivial, though still surprisingly useful, observation deals with subgroups of index 2, namely subgroups $H$ that have exactly two cosets. As pointed out in Proposition 10.3, such subgroups are necessarily normal. Moreover, as they determine only two cosets, their quotients contain only two elements, and so they are isomorphic to $(\mathbb{Z}_2, +)$. Accordingly, $S_4/A_4$ is isomorphic to $(\mathbb{Z}_2, +)$.

Similarly, if $H$ is a normal subgroup of index 3 in $G$, then $G/H$ contains three elements and so it must be isomorphic to $(\mathbb{Z}_3, +)$. Since it was pointed out above that $K$ is a normal subgroup of $A_4$ and since $K$ has index 3 in $A_4$, it follows that $A_4/K$ is isomorphic to $(\mathbb{Z}_3, +)$.


We offer two observations that can be helpful in identifying quotient groups. 


Proposition 10.5 Let $H$ be a normal subgroup of the group $G$. Then the order of $aH$ in $G/H$ is a divisor of the order of $a$ in $G$.

Proof. Suppose $a$ has order $m$ in $G$. Then

\[ (aH)^m = (aH)(aH)\cdots(aH) = a^m H = 1_G H = H. \]

Since $H$ is the identity element of the quotient group $G/H$, it follows from Proposition 9.7 that the order of $aH$ in $G/H$ is a divisor of $m$. ∎


Consider the group $(F, +)$ where $F = GF(2, x^4 + x + 1)$. This group has order 16 and is commutative. If $\delta$ denotes the Galois imaginary that is associated with this field, then $H = \{0, 1, \delta, 1 + \delta\}$ is a subgroup of order 4. Since $(F, +)$ is commutative, this subgroup is normal. Thus, $F/H$ is a group of order 4. Since every nonidentity element of $F$ has order 2, it follows from the above proposition that every nonidentity element of $F/H$ also has order 2. Hence, $F/H$ is isomorphic to the Klein 4-group.


Proposition 10.6 If $G$ is a cyclic group and $H$ is a subgroup of $G$, then $G/H$ is also cyclic.

Proof. Let $g$ be a generator of $G$. Since the typical element of $G$ has the form $g^k$, it follows that the typical element of $G/H$ has the form $g^k H = (gH)^k$. Hence, $G/H$ is generated by the coset $gH$, and so $G/H$ is necessarily cyclic. ∎


Consider the group $(F^*, \cdot)$ where $F = GF(3, x^2 + x + 2)$, which was discussed in detail in Section 7.1. This is a commutative group of order 8. If $\delta$ denotes the Galois imaginary associated with this field, then we already know that $\delta$ is primitive, so that $\delta$ is a generator of $(F^*, \cdot)$. Thus, $(F^*, \cdot)$ is a cyclic group. This group has $H = \{1, \delta^4\}$ as a subgroup, which is necessarily normal. Since $F^*$ is cyclic, so is $F^*/H$. Since the quotient $F^*/H$ has order 4, it follows that $F^*/H$ is isomorphic to $(\mathbb{Z}_4, +)$.

The quotient method for constructing groups provides us with a new perspective on modulo $n$ arithmetic.

Corollary 10.7 Let $n\mathbb{Z}$ denote the subgroup of the group of integers $\mathbb{Z}$ that is generated by the integer $n$. Then the quotient group $\mathbb{Z}/n\mathbb{Z}$ is isomorphic to $(\mathbb{Z}_n, +)$.

Proof. By definition,

\[ n\mathbb{Z} = \{0, \pm n, \pm 2n, \pm 3n, \ldots\}, \]

and hence the cosets of $H = n\mathbb{Z}$ are $H, 1 + H, 2 + H, \ldots, (n-1) + H$. Since, by Proposition 10.6, $\mathbb{Z}/n\mathbb{Z}$ is cyclic, it follows from Proposition 9.15 that $\mathbb{Z}/n\mathbb{Z}$ is isomorphic to $(\mathbb{Z}_n, +)$. ∎


This corollary tells us that another way to define $\mathbb{Z}_n$ is to view each of its elements as a coset of $n\mathbb{Z}$ in $\mathbb{Z}$. Thus, the element 2 in $\mathbb{Z}_5$ is to be regarded as a shorthand notation for the coset

\[ 2 + 5\mathbb{Z} = \{\ldots, -8, -3, 2, 7, 12, \ldots\} \]

of $5\mathbb{Z}$ in $\mathbb{Z}$.

However, the elements of $\mathbb{Z}_n$ are also subject to multiplication, and it behooves us to verify that this multiplication is consistent with the coset point of view. To do this we define the product of the two cosets $a + n\mathbb{Z}$ and $b + n\mathbb{Z}$ in $\mathbb{Z}_n$ as the coset $ab + n\mathbb{Z}$. There is a potential problem with this glib definition. Note that $2 + 5\mathbb{Z} = 7 + 5\mathbb{Z}$ and $4 + 5\mathbb{Z} = 9 + 5\mathbb{Z}$. Is the product of these two cosets to be $2 \cdot 4 + 5\mathbb{Z}$, $2 \cdot 9 + 5\mathbb{Z}$, $7 \cdot 4 + 5\mathbb{Z}$, or $7 \cdot 9 + 5\mathbb{Z}$? Fortunately, this question is moot since

\[ 2 \cdot 4 \equiv 2 \cdot 9 \equiv 7 \cdot 4 \equiv 7 \cdot 9 \pmod{5} \]

and so

\[ 2 \cdot 4 + 5\mathbb{Z} = 2 \cdot 9 + 5\mathbb{Z} = 7 \cdot 4 + 5\mathbb{Z} = 7 \cdot 9 + 5\mathbb{Z}. \]

This is the case in general as well. For, if both $a - a'$ and $b - b'$ are divisible by $n$, then

\[ ab - a'b' = b(a - a') + a'(b - b') \]

is also divisible by $n$, and hence $ab + n\mathbb{Z} = a'b' + n\mathbb{Z}$.
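The independence of the product from the chosen representatives can also be confirmed experimentally. This is a sketch only: it samples finitely many representatives $a + in$ and $b + jn$ of each coset (the parameter `spread` is an arbitrary, hypothetical bound), so it illustrates the identity above rather than proving it.

```python
def coset_product_is_well_defined(n, spread=3):
    """Check that (a + i*n)(b + j*n) always lies in the coset ab + nZ."""
    for a in range(n):
        for b in range(n):
            products = {((a + i * n) * (b + j * n)) % n
                        for i in range(-spread, spread + 1)
                        for j in range(-spread, spread + 1)}
            # Every sampled pair of representatives must give the same residue.
            if products != {(a * b) % n}:
                return False
    return True
```

With $n = 5$ the sample includes the representatives 2, 7 and 4, 9 discussed in the text.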


Exercises 10.1 


For each of the pairs $G$, $H$ in Exercises 10.1.1 to 10.1.11, compute the cosets of $H$ in $G$. Decide whether $H$ is a normal subgroup of $G$. If it is normal, identify the isomorphism type of the quotient group $G/H$. If $H$ is not a normal subgroup of $G$, explain why not.


I. 
2. 
3. 
4: 
5. 
6. 


G=(Z,.+), H =(4) 7 G=V1,H={1,-1} 
G =(Z,,+), H = (3) 8. G=V/1, H={1,0,07} 


=(Z,5,+), H =(5) 9. G=A,, H=((123)) 


= ((12)) Io. G =(Zi,-), H=(2) 
=S,, H=((123)) i. G=(Zi,-), H=(7) 


For each of the groups $G$ in Exercises 10.1.12 to 10.1.20 determine all the nontrivial normal subgroups $H$ of $G$ and identify $G/H$ for each such subgroup $H$.


12. 
13. 
14. 
15. 
16. 


21. 


22. 


23. 


the Quaternion group 17. (GF*(3, x? +x +2),-) 

fie 18. (GF(5, x* +4x +2), +) 

(GF(2, x? + x +1), +) 

(GF(2, x? + x? + 1), +) es 

(GF(3, x? + x + 2), +) 20. D, 

Suppose that $H$ is a subgroup of $G$ such that no other subgroup of $G$ is isomorphic to $H$. Prove that $H$ is a normal subgroup of $G$.

Suppose $a$ is an element of the group $G$ such that no other element has the same order as $a$. Prove that $a$ has order 1 or 2, and that $\langle a \rangle$ is a normal subgroup of $G$.

Prove that the center $Z(G)$ of the group $G$ is a normal subgroup of $G$.




The element $a$ of the group $G$ is said to be conjugate to the element $b$ if there exists an element $x$ of $G$ such that $xax^{-1} = b$. The set $C(a)$ consists of all the elements of $G$ that are conjugate to $a$ and is called the conjugacy class of $a$.

24. Prove that $a$ is conjugate to $b$ if and only if $b$ is conjugate to $a$.
25. Prove that if $a$ is conjugate to $b$ and $b$ is conjugate to $c$, then $a$ is conjugate to $c$.
26. Prove that if $a$ and $b$ are conjugate, then $C(a) = C(b)$.
27. Prove that if $a$ and $b$ are any two elements of a group $G$, then $ab$ and $ba$ are conjugate.
28. Describe the conjugacy class of each element of S,. 
29. Describe the conjugacy class of each element of S._. 


30. Prove that the number of elements in C(a) equals [G : Z,] and is therefore a divisor 


of the order of G whenever G is finite. 


31. Prove that if $p$ is a prime number, then every group of order $p^n$, $n > 0$, has a nontrivial center.
32. Prove that if $H$ is a normal subgroup of $G$, then $(xH)^{-1} = x^{-1}H$ for all $x$ in $G$.
33. Suppose $H$ and $K$ are normal subgroups of $G$ such that $H \cap K = \{1_G\}$. Prove that $ab = ba$ whenever $a \in H$ and $b \in K$.

34. Prove that $A_n$ is a normal subgroup of $S_n$ and that $S_n/A_n \cong (\mathbb{Z}_2, +)$.

35. Let $(G, \cdot)$ be any group. Prove that if $A$, $B$, and $C$ are any subsets of $G$, then $A \cdot (B \cdot C) = (A \cdot B) \cdot C$.

36. Prove that the intersection of two normal subgroups of $G$ is a normal subgroup of $G$.


10.2 Group Homomorphisms 


Eventually it became convenient to express the properties of groups in terms of functions. Given two groups $(G, \cdot)$ and $(H, \otimes)$ and a function $f : G \to H$, $f$ is said to be an isomorphism of these two groups (see Section 9.3) provided it satisfies the following two requirements:

• $f$ is a bijection of $G$ and $H$;

• $f(a \cdot b) = f(a) \otimes f(b)$ for all $a, b \in G$.

A function $f : G \to H$ is said to be a homomorphism of $G$ into $H$ if it satisfies the second requirement (but not necessarily the first). We now offer a list of examples.


    +    | Even  Odd
  -------+------------
   Even  | Even  Odd
   Odd   | Odd   Even

Table 10.3 A multiplication table


1. The function $f(a) = 2a$ is a homomorphism $(\mathbb{Z}, +) \to (\mathbb{Z}, +)$ because $f(a + b) = 2(a + b) = 2a + 2b = f(a) + f(b)$.

2. The function

\[ f(a) = \begin{cases} 0 & \text{if } a \text{ is even;} \\ 1 & \text{if } a \text{ is odd} \end{cases} \]

is a homomorphism $(\mathbb{Z}, +) \to (\mathbb{Z}_2, +)$, as can be verified by means of Table 10.3, which summarizes the multiplication table of $f$.

3. The function $f(a) = 2a$ is a homomorphism $(\mathbb{Z}_2, +) \to (\mathbb{Z}_4, +)$. This is easily verified by the following observations:

\[
\begin{aligned}
f(0 + 0) &= f(0) = 0 = 0 + 0 = f(0) + f(0); \\
f(0 + 1) &= f(1) = 2 = 0 + 2 = f(0) + f(1); \\
f(1 + 0) &= f(1) = 2 = 2 + 0 = f(1) + f(0); \\
f(1 + 1) &= f(2) = 0 = 2 + 2 = f(1) + f(1).
\end{aligned}
\]

In fact, if $m$ is a positive integer, then the function $f_m(a) = ma$ is a homomorphism $(\mathbb{Z}, +) \to (\mathbb{Z}, +)$ because

\[ f_m(a + b) = m(a + b) = ma + mb = f_m(a) + f_m(b). \]


A homomorphism $f : G \to H$ is said to be injective or a monomorphism provided the function $f : G \to H$ is injective. It is clear that the functions described in the first and third examples above are, in fact, monomorphisms, whereas the function of the second example is not.

4. If $(A, \cdot)$ is a subgroup of $(G, \cdot)$, then the inclusion function $i : A \to G$ defined by $i(a) = a$ for all $a \in A$ is also a homomorphism, since for all $a, b \in A$, $i(a \cdot b) = a \cdot b = i(a) \cdot i(b)$. This shows that the inclusion functions $\mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \subset \mathbb{C}$ are all (injective) homomorphisms of the corresponding additive groups.

5. If $(G, \cdot)$ and $(H, \otimes)$ are any two groups, then the function $f : G \to H$ defined by $f(a) = 1_H$ for all $a \in G$ is a homomorphism because

\[ f(a \cdot b) = 1_H = 1_H \otimes 1_H = f(a) \otimes f(b). \]

6. Consider the function $f : \mathbb{Z}_4 \to \mathbb{Z}_4$ defined by $f(a) = 2a$. This is a homomorphism because for any two elements $a, b \in \mathbb{Z}_4$

\[ f(a + b) = 2(a + b) = 2a + 2b = f(a) + f(b) \pmod{4}. \]

Note that the image of this $f$ is the proper subset $\{0, 2\}$ of $\mathbb{Z}_4$.


7. If $(G, \cdot) = (S_n, \circ)$ and $\sigma$ is any permutation in $S_n$, then the function defined as

\[ f(\sigma) = \begin{cases} 0 & \text{if } \sigma \text{ is even;} \\ 1 & \text{if } \sigma \text{ is odd} \end{cases} \]

is a homomorphism. This follows from the definition of the parity of a permutation as the parity of the number of factors in any expression of the permutation as the composition of transpositions. The numerical value $f(\sigma)$ is called the signature of $\sigma$ and is denoted by $\operatorname{sign}(\sigma)$.

8. Let $n$ be any fixed positive integer in $\mathbb{Z}$. Let $T$ denote the unit circle group under multiplication. The function $f_n : (\mathbb{Z}, +) \to T$ defined as

\[ f_n(k) = \cos\frac{2\pi k}{n} + i \sin\frac{2\pi k}{n} \]

is a homomorphism because

\[
\begin{aligned}
f_n(k + m) &= \cos\frac{2\pi(k + m)}{n} + i \sin\frac{2\pi(k + m)}{n} \\
&= \left( \cos\frac{2\pi k}{n} + i \sin\frac{2\pi k}{n} \right) \cdot \left( \cos\frac{2\pi m}{n} + i \sin\frac{2\pi m}{n} \right) = f_n(k) \cdot f_n(m).
\end{aligned}
\]

The transition from the second expression above to the third one is not magic; it is De Moivre's Theorem.
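For maps between finite cyclic groups, the defining identity of a homomorphism can be checked exhaustively. The sketch below assumes that a map $(\mathbb{Z}_m, +) \to (\mathbb{Z}_n, +)$ is given by a Python function on the representatives $0, \dots, m-1$; the names `is_homomorphism`, `is_injective`, and `double` are illustrative.

```python
def is_homomorphism(f, m, n):
    """Brute-force test that f defines a homomorphism (Z_m, +) -> (Z_n, +)."""
    return all(f((a + b) % m) % n == (f(a) + f(b)) % n
               for a in range(m) for b in range(m))

def is_injective(f, m, n):
    """True if the m residues of Z_m have m distinct images in Z_n."""
    return len({f(a) % n for a in range(m)}) == m

double = lambda a: 2 * a   # the map a -> 2a of examples 3 and 6
```

Under these conventions `double` passes the test as a map $\mathbb{Z}_2 \to \mathbb{Z}_4$ (example 3, injective) and as a map $\mathbb{Z}_4 \to \mathbb{Z}_4$ (example 6, not injective), but fails as a map $\mathbb{Z}_3 \to \mathbb{Z}_4$.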




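Lemma 10.8 Let $f : (G, \cdot) \to (H, \otimes)$ be a homomorphism. Then $f(1_G) = 1_H$ and $f(a^{-1}) = (f(a))^{-1}$ for all $a \in G$.

Proof. Observe that

\[ f(1_G) = f(1_G \cdot 1_G) = f(1_G) \otimes f(1_G) = (f(1_G))^2 \]

and hence

\[ 1_H = f(1_G) \otimes (f(1_G))^{-1} = (f(1_G))^2 \otimes (f(1_G))^{-1} = f(1_G), \]

proving the first equation.

Let $a$ be any element of $G$; then

\[ 1_H = f(1_G) = f(a \cdot a^{-1}) = f(a) \otimes f(a^{-1}) \]

so that $f(a^{-1}) = (f(a))^{-1}$. ∎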


Let $f : G \to H$ be a group homomorphism. The kernel of $f$ is denoted by $\operatorname{Ker} f$ and is defined as

\[ \operatorname{Ker} f = f^{-1}(1_H) = \{ g \in G \mid f(g) = 1_H \}. \]

It is clear that in examples 1, 3, and 4 above the kernel of $f$ is a trivial group consisting of $1_G$ alone. However, in example 6 the kernel equals $\{0, 2\}$. The kernel for example 7 is the alternating group $A_n$. In example 8 the kernel of $f_n$ is the infinite set

\[ \{ mn \mid m \in \mathbb{Z} \} = n\mathbb{Z}, \]

since $\cos\frac{2\pi k}{n} + i \sin\frac{2\pi k}{n} = 1$ exactly when $k$ is a multiple of $n$.
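Kernels and images of maps between finite cyclic groups can be listed directly from the definition. This is a minimal sketch assuming a homomorphism $(\mathbb{Z}_m, +) \to (\mathbb{Z}_n, +)$ given on the representatives $0, \dots, m-1$, so that the identity of the target is the residue 0; the function names are hypothetical.

```python
def kernel(f, m, n):
    """Elements of Z_m that f sends to the identity 0 of (Z_n, +)."""
    return {a for a in range(m) if f(a) % n == 0}

def image(f, m, n):
    """The set of values taken by f in Z_n."""
    return {f(a) % n for a in range(m)}

double = lambda a: 2 * a   # the map a -> 2a of examples 3 and 6
```

For example 6 both the kernel and the image are $\{0, 2\}$, whereas for example 3 the kernel is trivial.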


The examples above indicate that Ker f is a subgroup of G. In fact more is true. 


Proposition 10.9 (First Isomorphism Theorem) Let $f : G \to H$ be a group homomorphism. Then

(a) $\operatorname{Ker} f$ is a normal subgroup of $G$;

(b) $\operatorname{Im} f$ is a subgroup of $H$;

(c) $\operatorname{Im} f$ is isomorphic to the quotient group $G/\operatorname{Ker} f$.




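Proof. For (a), let $a$ and $b$ be elements of $\operatorname{Ker} f$. Then

\[ f(a \cdot b) = f(a) \otimes f(b) = 1_H \otimes 1_H = 1_H, \]

implying that $a \cdot b \in \operatorname{Ker} f$, which proves that the kernel is closed with respect to the group operation in $G$.

Now let $c$ be any element of $\operatorname{Ker} f$. Then $c^{-1}$ exists in $G$. Hence

\[ 1_H = f(c) \otimes (f(c))^{-1} = 1_H \otimes f(c^{-1}) = f(c^{-1}) \]

and hence $c^{-1} \in \operatorname{Ker} f$.

Finally we address the normality issue. By Theorem 10.2 it suffices to show that for every element $c$ of $G$,

\[ c^{-1} \cdot \operatorname{Ker} f \cdot c \subseteq \operatorname{Ker} f. \]

For this it suffices to show that for any such $c$ and any $x \in \operatorname{Ker} f$,

\[ c^{-1} \cdot x \cdot c \in \operatorname{Ker} f. \]

This, however, is clear since

\[ f(c^{-1} \cdot x \cdot c) = f(c^{-1}) \otimes f(x) \otimes f(c) = (f(c))^{-1} \otimes 1_H \otimes f(c) = (f(c))^{-1} \otimes f(c) = 1_H. \]

For (b), see Exercise 10.2.19.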

For (c), let $K = \operatorname{Ker} f$. Since $K$ is now known to be a normal subgroup of $G$, the cosets of $K$ in $G$ can be written down as $\{ aK = Ka \mid a \in G \}$. Keep in mind that $aK = bK$ if and only if $a^{-1}b \in K$. Let $\varphi : G/K \to H$ be defined via $\varphi(aK) = f(a)$. This function $\varphi$ is well defined, for, if $a$ and $b$ are elements of the same coset of $K$, then there exists an element $k \in K$ such that $b = ak$. Consequently

\[ \varphi(bK) = f(b) = f(ak) = f(a) \otimes f(k) = f(a) \otimes 1_H = f(a) = \varphi(aK). \]

Moreover, for any two cosets $aK$ and $bK$,

\[ \varphi(aK \cdot bK) = \varphi(abK) = f(ab) = f(a) \otimes f(b) = \varphi(aK) \otimes \varphi(bK) \]

so that $\varphi : G/K \to H$ is a homomorphism.




If $h$ is an arbitrary element of $\operatorname{Im} f$, then there exists an element $g \in G$ such that $f(g) = h$. Then $\varphi(gK) = f(g) = h$, which implies that $\varphi$ maps $G/K$ onto $\operatorname{Im} f$.

Finally, for any elements $a, b$ in $G$, $\varphi(aK) = \varphi(bK)$ implies that $f(a) = f(b)$, or $[f(b)]^{-1} \otimes f(a) = 1_H$, or $f(b^{-1}a) = f(b^{-1}) \otimes f(a) = 1_H$, so $b^{-1}a \in K$, which means that $a$ and $b$ are in the same coset of $K$. Thus $\varphi$ is injective, and hence it is an isomorphism of $G/K$ onto $\operatorname{Im} f$. ∎
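The construction of the induced map on cosets can be traced on a small example. The sketch below assumes a homomorphism $(\mathbb{Z}_m, +) \to (\mathbb{Z}_n, +)$ given on representatives; the function name `induced_map` and the sample map $a \mapsto 3a$ on $\mathbb{Z}_{12}$ are illustrative choices that do not appear in the text.

```python
def induced_map(f, m, n):
    """Build the map aK -> f(a) on the cosets of K = Ker f in (Z_m, +).

    f is assumed to be a homomorphism (Z_m, +) -> (Z_n, +). A ValueError
    is raised if two representatives of one coset give different values.
    """
    K = frozenset(a for a in range(m) if f(a) % n == 0)
    phi = {}
    for a in range(m):
        coset = frozenset((a + k) % m for k in K)
        value = f(a) % n
        # Well defined: every representative must give the stored value.
        if phi.setdefault(coset, value) != value:
            raise ValueError("the induced map is not well defined")
    return phi

triple = lambda a: 3 * a
phi = induced_map(triple, 12, 12)
```

Here the kernel is $\{0, 4, 8\}$, so there are four cosets, and they are matched one to one with the four elements $\{0, 3, 6, 9\}$ of the image.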

The next proposition can be viewed as a converse of the First Isomorphism Theorem.

Proposition 10.10 Let $(A, \cdot)$ be a normal subgroup of $(G, \cdot)$. Then there exists a surjective homomorphism $f : (G, \cdot) \to (H, \otimes)$ such that $\operatorname{Ker} f = A$.

Proof. This follows immediately from Theorem 10.2: take $H = G/A$ and $f(x) = xA$. ∎

For example, the function associated with the quotient group displayed in Figure 10.1 is

\[ f(0) = f(3) = f(6) = f(9) = H_0, \qquad f(1) = f(4) = f(7) = f(10) = H_1, \qquad f(2) = f(5) = f(8) = f(11) = H_2. \]

The function associated with the quotient group displayed in Figure 10.2 is

\[ f(1) = f(d) = H, \qquad f(a) = f(e) = aH, \qquad f(b) = f(f) = bH, \qquad f(c) = f(g) = cH. \]


Suppose we wish to find all the homomorphisms $f : K \to S_3$ where $K$ is the Klein 4-group. The subgroups of $K$ are $K$, $\{1, a\}$, $\{1, b\}$, $\{1, c\}$, and $\{1\}$, and all of them are normal in $K$ because $K$ is abelian. Consequently, by the First Isomorphism Theorem, the image of any such homomorphism is isomorphic to one of three quotients, namely $K/K = \{1\}$, $K/\{1, a\} \cong K/\{1, b\} \cong K/\{1, c\}$, and $K/\{1\} = K$. For $f : S_3 \to K$, the subgroups of $S_3$ are $S_3$, $A_3$, the three subgroups of order 2, and $\{1\}$, but of these only $S_3$, $A_3$, and $\{1\}$ are normal. Hence there are three candidates for the kernel of such a homomorphism.


Exercises 10.2 


In each of Exercises 10.2.1 to 10.2.18, find all of the homomorphisms $f : G \to H$ where
$G$ and H are the given groups.

1. G is the trivial group and H is an arbitrary group.

2. G is an arbitrary group and H is the trivial group.


3.5 G=H=(Z,,+) 11. G=(Z.,+), H =(Z,,+) 
4. G=(Z,,+), H =(Z,,+) 12. G=(Z,,+), H =(Z,,+) 
5. G=(Z,,+), H =(Z,,+) 13. G=(Z,,+), H =(Z,,+) 
6. G=(Z,,+), H =(Z, +) 14. G=(Z,,+), H =(Z¢,+) 
7- G=(Z,,+), H =(Z,,+) 15. G=(Z,+), H =(Z,,+) 
8. G=(Z,+), H=(Z,,+) 16. G=(Z,,+), H =(Z,,+) 
9° G=(Z,,+), H=(Z;,+) 17. G=(Z,,+), H = Klein 4-group) 
10. G=(Z,,+), H =(Z,,+) 18. G = Klein 4-group, H =(Z,,+) 


19. Prove the second part of the First Isomorphism Theorem. 


10.3. The Rigorous Construction of Fields 


If H is a normal subgroup of G, then, loosely speaking, we say that G/H inherits its 
binary operation from that of G. Some well-known groups, however, are subject to 
additional operations, as is the case, for instance, for the groups (Z, +) and (F [x], +) for 
any field F. We saw above that the multiplication of integers could be transferred to
$(Z_n, +)$ so as to convert it to arithmetic modulo $n$. It will now be demonstrated that a
similar procedure can be employed to yield rigorous constructions of both the complex 
numbers of Chapter 2 and the Galois fields of Chapter 7, as well as a host of new fields. 

Let $F$ be an arbitrary field and let $P(x)$ be any polynomial in $F[x]$. If we set $G =
(F[x], +)$ and let $H$ be the set of all the polynomials in $F[x]$ that are divisible by $P(x)$,
then $H$ is a subgroup of $G$. The reason for this is that $0 = 0 \cdot P(x)$ is clearly a multiple
of $P(x)$, and if $A(x)P(x)$ and $B(x)P(x)$ are any two multiples of $P(x)$, then so are

$$A(x)P(x) + B(x)P(x) = (A(x) + B(x))P(x)$$

and $-A(x)P(x) = (-A(x))P(x)$.




Since $G$ is a commutative group it follows that $H$ is necessarily a normal subgroup of
$G$ and so there exists a quotient group $G/H$. We shall show that when the polynomial
$P(x)$ is irreducible the quotient group $G/H$ can be converted to a field. The fields so
obtained include the Galois fields as well as a variety of new fields. Consider the case
where $F = Z_2$ and $P(x) = x^2 + x + 1$, which is known to be irreducible over $F$. By
Corollary 9.11, the polynomials $M(x)$ and $N(x)$ in $F[x]$ belong to the same coset of
$H$ if and only if

$$-N(x) + M(x) = M(x) - N(x) \in H,$$

which is true if and only if $P(x)$ divides $M(x) - N(x)$. Since the difference of any two
of the polynomials $0$, $1$, $x$, $1 + x$ has degree at most 1, it follows that the four cosets

$$0 + H, \quad 1 + H, \quad x + H, \quad 1 + x + H \tag{10.11}$$

are all distinct. Moreover, if $M(x)$ is any polynomial of $F[x]$, then, by Proposition 6.4,
there exist polynomials $Q(x)$ and $R(x)$ such that

$$M(x) = Q(x)(x^2 + x + 1) + R(x) \tag{10.12}$$

and $R(x)$ is either 0 or else has degree at most 1. In other words, $R(x)$ is one of the
polynomials $0$, $1$, $x$, or $1 + x$.

However, it is clear from Equation 10.12 above that $M(x) - R(x)$ is divisible by
$x^2 + x + 1$ and hence $M(x)$ and $R(x)$ belong to the same coset of $H$. Consequently,
every polynomial of $F[x]$ belongs to one of the cosets in List 10.11. Thus, $G/H$ has
four elements only. (It happens to be isomorphic to the Klein 4-group, but that is of no
interest to us now.)


We shall next convert $G/H$, which already has the additive operation

$$(a + H) + (b + H) = (a + b) + H,$$

to a field by defining a multiplication of its elements. Specifically, we define

$$(a + H) \cdot (b + H) = ab + H.$$

These two operations on $G/H$ can be tabulated; see Tables 10.4 and 10.5. In creating
these tables we wrote the entry for $(x + H) \cdot (x + H) = x^2 + H$ as $1 + x + H$ since these
two cosets are identical.

    +      |  H        1+H      x+H      1+x+H
    -------+-----------------------------------
    H      |  H        1+H      x+H      1+x+H
    1+H    |  1+H      H        1+x+H    x+H
    x+H    |  x+H      1+x+H    H        1+H
    1+x+H  |  1+x+H    x+H      1+H      H

Table 10.4 | Addition in $Z_2[x]/(x^2 + x + 1)$

    ·      |  H        1+H      x+H      1+x+H
    -------+-----------------------------------
    H      |  H        H        H        H
    1+H    |  H        1+H      x+H      1+x+H
    x+H    |  H        x+H      1+x+H    1+H
    1+x+H  |  H        1+x+H    1+H      x+H

Table 10.5 | Multiplication in $Z_2[x]/(x^2 + x + 1)$

These tables are in fact identical with those of $GF(2, x^2 + x + 1)$
(Tables 10.6 and 10.7) if the $H$ is either suppressed or replaced by zero and the $x$ is
replaced by $\alpha$.

Thus, our quotient $G/H$ is none other than the Galois field $GF(2, x^2 + x + 1)$ in disguise.
This laborious reconstruction of this Galois field has the advantage of mathematical
rigor. Galois’s construction of his fields presumed the existence of at least one zero for 
every irreducible polynomial. The only justification given by Galois for this assumption 
was the analogy with the complex numbers. The construction given here, on the other 
hand, makes no such assumptions. 
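
The coset arithmetic just described is easy to reproduce by machine. The sketch below is only an illustration under stated assumptions: the coset $r + sx + H$ is stored as the pair `(r, s)`, and the names `ELEMENTS`, `NAMES`, `add`, `mul`, and `table` are hypothetical helpers introduced here, not notation from the text.

```python
# Coset r + s*x + H of Z_2[x]/(x^2 + x + 1), stored as the pair (r, s).
ELEMENTS = [(0, 0), (1, 0), (0, 1), (1, 1)]
NAMES = {(0, 0): "H", (1, 0): "1+H", (0, 1): "x+H", (1, 1): "1+x+H"}

def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    # (r1 + s1 x)(r2 + s2 x) = r1 r2 + (r1 s2 + r2 s1) x + s1 s2 x^2,
    # and x^2 + H = 1 + x + H, so s1*s2 is added to both coordinates.
    r1, s1 = u
    r2, s2 = v
    return ((r1 * r2 + s1 * s2) % 2, (r1 * s2 + r2 * s1 + s1 * s2) % 2)

def table(op):
    """Rows of the operation table, in the order H, 1+H, x+H, 1+x+H."""
    return [[NAMES[op(u, v)] for v in ELEMENTS] for u in ELEMENTS]

for row in table(mul):
    print("  ".join(f"{entry:>6}" for entry in row))

# Every nonzero coset has a multiplicative inverse, as a field requires.
nonzero = [u for u in ELEMENTS if u != (0, 0)]
assert all(any(mul(u, v) == (1, 0) for v in nonzero) for u in nonzero)
```

Calling `table(add)` instead of `table(mul)` produces the rows of the addition table in the same order.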

The same method that was used above to construct $GF(2, x^2 + x + 1)$ can be applied in
other situations to construct a wide variety of both old and new fields. Before describing
this construction in its full generality, it is necessary to formalize some concepts. Two
fields $F$ and $F'$ are said to be isomorphic if there is a function $f : F \to F'$ such that

• $f(a_1) \neq f(a_2)$ for any distinct $a_1, a_2 \in F$,

• for each $b \in F'$ there is an $a \in F$ such that $f(a) = b$,

• $f(a_1 + a_2) = f(a_1) + f(a_2)$ for any $a_1, a_2 \in F$,

• $f(a_1 a_2) = f(a_1) f(a_2)$ for any $a_1, a_2 \in F$.




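    +      |  0      1      α      1+α
    -------+---------------------------
    0      |  0      1      α      1+α
    1      |  1      0      1+α    α
    α      |  α      1+α    0      1
    1+α    |  1+α    α      1      0

Table 10.6. Addition in $GF(2, x^2 + x + 1)$

    ·      |  0      1      α      1+α
    -------+---------------------------
    0      |  0      0      0      0
    1      |  0      1      α      1+α
    α      |  0      α      1+α    1
    1+α    |  0      1+α    1      α

Table 10.7. Multiplication in $GF(2, x^2 + x + 1)$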


The function $f$ is said to be an isomorphism. A comparison of this notion with that of
the group isomorphism defined in Section 9.3 leads to the conclusion that every field
isomorphism of $F$ and $F'$ constitutes both a group isomorphism of $(F, +)$ and $(F', +)$ on
the one hand and a group isomorphism of $(F^*, \cdot)$ and $(F'^*, \cdot)$ on the other. The function

$$f(r + s\alpha) = (r + sx) + H$$

constitutes an isomorphism of $GF(2, x^2 + x + 1)$ and the above constructed quotient
$G/H$. The first two properties are clearly satisfied whereas the last two properties follow
from an examination of the above tables and the following arguments:

$$\begin{aligned}
f((r_1 + s_1\alpha) + (r_2 + s_2\alpha)) &= f((r_1 + r_2) + (s_1 + s_2)\alpha) \\
&= (r_1 + r_2) + (s_1 + s_2)x + H \\
&= (r_1 + s_1 x + H) + (r_2 + s_2 x + H) \\
&= f(r_1 + s_1\alpha) + f(r_2 + s_2\alpha)
\end{aligned}$$




and, since $\alpha^2 = \alpha + 1$ and $x^2 + H = x + 1 + H$,

$$\begin{aligned}
f((r_1 + s_1\alpha)(r_2 + s_2\alpha)) &= f(r_1 r_2 + (r_1 s_2 + r_2 s_1)\alpha + s_1 s_2 \alpha^2) \\
&= f((r_1 r_2 + s_1 s_2) + (r_1 s_2 + r_2 s_1 + s_1 s_2)\alpha) \\
&= (r_1 r_2 + s_1 s_2) + (r_1 s_2 + r_2 s_1 + s_1 s_2)x + H \\
&= r_1 r_2 + (r_1 s_2 + r_2 s_1)x + s_1 s_2 x^2 + H \\
&= (r_1 + s_1 x)(r_2 + s_2 x) + H \\
&= (r_1 + s_1 x + H) \cdot (r_2 + s_2 x + H) \\
&= f(r_1 + s_1\alpha) \cdot f(r_2 + s_2\alpha).
\end{aligned}$$

We now generalize this procedure to arbitrary fields and irreducible polynomials, first
addressing the issue of addition, and only later dealing with multiplication.

If $P(x)$ is any polynomial over the field $F$, then we denote by $(P(x))$ the set of all the
polynomials of $F[x]$ that are divisible by $P(x)$. The set $(P(x))$ is a normal subgroup
of $(F[x], +)$ and hence it determines a quotient group $F[x]/(P(x))$. The cosets of this
group all have the unwieldy form $M(x) + (P(x))$, and it is convenient to replace this
clumsy expression by $[M(x)] = M(x) + (P(x))$. It may be easily verified that in terms of
this notation the addition of cosets assumes the form

$$[M(x)] + [N(x)] = [M(x) + N(x)]. \tag{10.13}$$


The following proposition makes it easy to visualize the additive group F [x]/(P(x)). 


Proposition 10.14 Let $F$ be a field and let $P(x)$ be a polynomial of degree $d$ over $F$.
Then $F[x]/(P(x))$ is the set of $[R(x)]$ such that $R(x) \in F[x]$ and either the degree of
$R(x)$ is less than $d$ or $R(x) = 0$.

Proof. By definition, every element of $F[x]/(P(x))$ has the form $[M(x)]$ for some
polynomial $M(x)$ over $F$. If $M(x)$ is now divided by $P(x)$ so as to yield

$$M(x) = Q(x)P(x) + R(x)$$

with $R(x)$ being either the zero polynomial or having degree less than $d$, then, since

$$M(x) - R(x) = Q(x)P(x),$$




which is divisible by $P(x)$, it follows that $[M(x)] = [R(x)]$. In other words, every
element of the quotient structure $F[x]/(P(x))$ can be represented in the form $[R(x)]$
where $R(x)$ is either the zero polynomial or else has degree less than $d$.

Suppose now that $R(x)$ and $R'(x)$ are two polynomials of degree less than $d$ such that

$$[R(x)] = [R'(x)].$$

It then follows that the difference $R(x) - R'(x)$ is a polynomial of degree less than $d$ that
is divisible by the polynomial $P(x)$ of degree $d$. Clearly then, $R(x) - R'(x)$ must be the
zero polynomial; in other words, $R(x) = R'(x)$.

Hence we have shown that the nonzero elements of $F[x]/(P(x))$ are in a one to one
correspondence with the polynomials of degree less than $d$ over $F$.


Accordingly,

$$\begin{aligned}
Z_2[x]/(x^2 + x + 1) &= \{[0], [1], [x], [1 + x]\}, \\
Z_2[x]/(x^2 + 1) &= \{[0], [1], [x], [1 + x]\}, \\
Z_3[x]/(x^2 + 1) &= \{[0], [1], [2], [x], [1 + x], [2 + x], [2x], [1 + 2x], [2 + 2x]\}, \\
R[x]/(x^2 + 1) &= \{[a + bx] \mid a, b \in R\}.
\end{aligned}$$

Some puzzlement may be caused by the similarity between $Z_2[x]/(x^2 + x + 1)$ and
$Z_2[x]/(x^2 + 1)$, since both have Table 10.4 as their addition tables. In fact, as additive
groups these two structures are indeed isomorphic. It is only when multiplication is also
brought into play that the difference between them becomes evident. The additive group
$F[x]/(P(x))$ is endowed with an operation of multiplication by the following definition:

$$(Q(x) + (P(x))) \cdot (R(x) + (P(x))) = Q(x)R(x) + (P(x)),$$

or, in terms of the bracket notation for cosets,

$$[Q(x)] \cdot [R(x)] = [Q(x)R(x)]. \tag{10.15}$$

The fact that this multiplication is unambiguous follows from an argument analogous to
that used in the last paragraph of the previous section (Exercise 10.3.24).


Thus, in $Z_2[x]/(x^2 + x + 1)$,

$$[1 + x] \cdot [1 + x] = [(1 + x)^2] = [1 + 2x + x^2] = [x],$$

whereas in $Z_2[x]/(x^2 + 1)$,

$$[1 + x] \cdot [1 + x] = [(1 + x)^2] = [1 + 2x + x^2] = [0].$$
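
The two products above can be reproduced with a few lines of code. This is a minimal sketch assuming that polynomials over $Z_p$ are stored as coefficient lists with the constant term first and that $P(x)$ is monic; the function name `poly_mulmod` is hypothetical and introduced only for illustration.

```python
def poly_mulmod(a, b, P, p):
    """Product of the cosets [a] and [b] in Z_p[x]/(P), for monic P.

    Polynomials are coefficient lists, constant term first; the result is the
    representative of degree less than deg P, padded to length deg P.
    """
    d = len(P) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Remove the terms of degree >= d by subtracting multiples of x^(k-d) * P.
    for k in range(len(prod) - 1, d - 1, -1):
        c = prod[k]
        if c:
            for i in range(d + 1):
                prod[k - d + i] = (prod[k - d + i] - c * P[i]) % p
    return (prod + [0] * d)[:d]

one_plus_x = [1, 1]
print(poly_mulmod(one_plus_x, one_plus_x, [1, 1, 1], 2))  # modulo x^2 + x + 1
print(poly_mulmod(one_plus_x, one_plus_x, [1, 0, 1], 2))  # modulo x^2 + 1
```

The first call returns the coefficient list of $x$ and the second the zero list, so $[1 + x]$ is a zero divisor in the second structure, which therefore cannot be a field.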


One more concept is necessary for the formulation of this section’s main theorem.
If $F$ is a field and $E$ is a subset of $F$ which also constitutes a field with respect to the
arithmetical operations it inherits from $F$, then $E$ is a subfield of $F$ and $F$ is an extension
of $E$. Thus, the complex numbers are an extension of the reals which, in turn, are
an extension of the rational numbers. Similarly, every Galois field $GF(p, P(x))$ is an
extension of $Z_p$. We shall use this terminology even when $F$ only contains a subfield that
is isomorphic to $E$ rather than $E$ itself. In general, isomorphic fields will be identified.


Theorem 10.16 If $F$ is any field and $P(x)$ is any irreducible polynomial over $F$, then
$F[x]/(P(x))$ is a field extension of $F$ that contains a zero of $P(x)$.

Proof. We first demonstrate that $F[x]/(P(x))$ is indeed a field. Since the addition and
multiplication of the elements of $F[x]/(P(x))$ are given by Equations 10.13 and 10.15
it follows that these operations satisfy all but the last of the requirements in Section 6.1
simply because the usual addition and multiplication of polynomials also satisfy those
requirements. Thus, the additive and multiplicative identities of $F[x]/(P(x))$ are $[0]$
and $[1]$, and, by way of example, commutativity holds for multiplication in $F[x]/(P(x))$
because

$$[Q(x)] \cdot [R(x)] = [Q(x)R(x)] = [R(x)Q(x)] = [R(x)] \cdot [Q(x)].$$

However, since the multiplicative inverse of a polynomial is not a polynomial, more
work is required to demonstrate that the nonzero elements of $F[x]/(P(x))$ possess
multiplicative inverses. The argument we give here is a slight modification of the proof
of Lemma 7.4.

Let $Q(x)$ be any polynomial over $F$ that is not in $(P(x))$, i.e., $Q(x)$ is a polynomial
that is not divisible by $P(x)$. Since $P(x)$ is irreducible this is tantamount to saying that


$P(x)$ and $Q(x)$ are relatively prime, and hence there exist $A(x), B(x) \in F[x]$ such that

$$A(x)Q(x) + B(x)P(x) = 1.$$

Passing on to cosets, we conclude that

$$[A(x)] \cdot [Q(x)] = [A(x)Q(x)] = [1 - B(x)P(x)] = [1],$$

the justification for the last equality being that $1 - B(x)P(x)$ and $1$ differ by an element of
$(P(x))$ and hence they belong to the same coset of $(P(x))$. Since $[1]$ is the multiplicative
identity of $F[x]/(P(x))$, it follows that $[A(x)]$ is the multiplicative inverse of $[Q(x)]$,
and so $F[x]/(P(x))$ is indeed a field. It is clear from the definitions of addition and
multiplication in $F[x]/(P(x))$ that the collection of cosets $F' = \{[r] \mid r \in F\}$ constitutes
a subfield of $F[x]/(P(x))$ that is isomorphic to $F$, the isomorphism $f : F \to F'$ being
defined by $f(r) = [r]$. Hence, $F[x]/(P(x))$ is indeed an extension of $F$. Finally, we
show that the coset $[x]$ is a zero of the polynomial $P(x)$ over the field $F[x]/(P(x))$. If

$$P(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n$$

is a polynomial over $F$, then as a polynomial over $F[x]/(P(x))$, it should be written as

$$P(y) = [a_0] y^n + [a_1] y^{n-1} + \cdots + [a_n].$$

The reason the variable $x$ of $P(x)$ needs to be replaced by $y$ is that $[x]$ has become
an element of the field $F[x]/(P(x))$. If the variable $y$ of $P(y)$ is now replaced by this
element $[x]$ of $F[x]/(P(x))$, then

$$\begin{aligned}
P([x]) &= [a_0] \cdot [x]^n + [a_1] \cdot [x]^{n-1} + \cdots + [a_n] \\
&= [a_0 x^n] + [a_1 x^{n-1}] + \cdots + [a_n] \\
&= [a_0 x^n + a_1 x^{n-1} + \cdots + a_n] \\
&= [P(x)] = [0].
\end{aligned}$$

In other words, the element $[x]$ of $F[x]/(P(x))$ is a zero of the polynomial $P(y)$, which,
of course, is identical with the polynomial $P(x)$.
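
The inverse $[A(x)]$ in the proof comes from the identity $A(x)Q(x) + B(x)P(x) = 1$, which the extended Euclidean algorithm produces. The following is a sketch for the special case $F = Z_p$ with $p$ prime, assuming polynomials are stored as coefficient lists with the constant term first and the zero polynomial as the empty list. All function names (`trim`, `add`, `mul`, `divmod_poly`, `inverse_mod`) are hypothetical helpers written for this illustration, and `pow(c, -1, p)` requires Python 3.8 or later.

```python
def trim(a):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def add(a, b, p):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([(x + y) % p for x, y in zip(a, b)])

def mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)

def divmod_poly(a, b, p):
    """Return (q, r) with a = q*b + r and deg r < deg b over Z_p (b nonzero, trimmed)."""
    a = trim([c % p for c in a])
    q = [0] * max(len(a) - len(b) + 1, 1)
    inv_lead = pow(b[-1], -1, p)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        c = (a[-1] * inv_lead) % p
        q[shift] = c
        # Subtract c * x^shift * b, which cancels the leading term of a.
        a = trim([(x - c * (b[i - shift] if 0 <= i - shift < len(b) else 0)) % p
                  for i, x in enumerate(a)])
    return trim(q), a

def inverse_mod(Q, P, p):
    """Return A with A*Q = 1 modulo P over Z_p, via the extended Euclidean algorithm."""
    P = trim([c % p for c in P])
    # Invariant: r0 = a0*Q (mod P) and r1 = a1*Q (mod P).
    r0, r1 = P, divmod_poly(Q, P, p)[1]
    a0, a1 = [], [1]
    while r1:
        q, r = divmod_poly(r0, r1, p)
        r0, r1 = r1, r
        a0, a1 = a1, add(a0, [(-c) % p for c in mul(q, a1, p)], p)
    # r0 is now a greatest common divisor of P and Q.
    if len(r0) != 1:
        raise ValueError("Q is not invertible modulo P")
    scale = pow(r0[0], -1, p)
    return divmod_poly([(scale * c) % p for c in a0], P, p)[1]

# In Z_3[x]/(x^2 + 1), find the inverse of [1 + x].
A = inverse_mod([1, 1], [1, 0, 1], 3)
print(A, divmod_poly(mul(A, [1, 1], 3), [1, 0, 1], 3)[1])
```

When $P(x)$ is reducible and $Q(x)$ shares a factor with it, the greatest common divisor is not a constant and the function raises an error, in line with Exercise 10.3.25.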




Amongst other things, this theorem justifies Galois’s assertion (Section 7.1) that every
polynomial that is irreducible over $Z_p$ has a zero which we called its Galois imaginary.
Being mortals ourselves, we will not comment on why Galois could make such an assertion 
without falling into the pits where most unjustified assumptions lead. 

Before this theorem is applied to the creation of some new fields, we will show how
it can be used to give a rigorous construction of the complex numbers. The polynomial
$x^2 + 1$ is irreducible over $R$ and hence, by the above theorem, $R[x]/(x^2 + 1)$ is a field
that contains a subfield isomorphic to $R$ and in which $[x]$ satisfies the equation

$$[x]^2 + [1] = [x^2 + 1] = [0].$$

Since the function $f(r) = [r]$ is an isomorphism of $R$ onto a subfield of $R[x]/(x^2 + 1)$,
we may identify $[r]$ with $r$. In addition, let us label $[x]$ by $i$, so that the typical element
of $R[x]/(x^2 + 1)$ is

$$[a + bx] = [a] + [b] \cdot [x] = a + bi$$

where $a, b \in R$ and

$$i^2 = [x]^2 = -[1] = -1.$$

It is now clear that $R[x]/(x^2 + 1)$ is in fact isomorphic to the complex number system.
This might be the place to note that this particular application is in fact the origin of 
Theorem 10.16. In his paper of 1847, Cauchy used this very approach to justify the 
existence of complex numbers. 
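
As a small numerical illustration of this identification, the sketch below multiplies two cosets $[a + bx]$ using only the rule $[x]^2 = -[1]$ and compares the result with Python's built-in complex arithmetic. The pair representation `(a, b)` and the function name `mul_mod_x2_plus_1` are assumptions made for this example.

```python
def mul_mod_x2_plus_1(u, v):
    """Multiply [a + b x] and [c + d x] in R[x]/(x^2 + 1), using [x]^2 = -[1]."""
    a, b = u
    c, d = v
    # (a + b x)(c + d x) = ac + (ad + bc) x + bd x^2, and x^2 is replaced by -1.
    return (a * c - b * d, a * d + b * c)

u, v = (1.0, 2.0), (3.0, -1.0)
w = mul_mod_x2_plus_1(u, v)
z = complex(*u) * complex(*v)
print(w, z)
```

Both computations give the same real and imaginary parts, as the isomorphism predicts.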

The Galois fields of Chapter 7 can also be regarded as a special case of this construction.
If $P(x)$ is an irreducible polynomial of degree $n$ over $Z_p$, then, according to
Proposition 10.14, the elements of the field $Z_p[x]/(P(x))$ all have the form

$$a_0 + a_1 [x] + a_2 [x]^2 + \cdots + a_{n-1} [x]^{n-1}$$

which, when $[x]$ is interpreted as the Galois imaginary $i$, is identical with Expression 7.2.
When this identification between the elements of $Z_p[x]/(P(x))$ and $GF(p, P(x))$ is
carried out, the arithmetical operations of the one are also identical with those of the other.
In other words, $Z_p[x]/(P(x))$ and $GF(p, P(x))$ are isomorphic fields. This observation
is a special case of the following very general and very strong theorem whose proof falls
outside the scope of this text. The order of a field is the number of elements it contains.




Theorem 10.17 Every finite field has order $p^n$ for some prime $p$ and some positive
integer $n$. Given any such $p$ and $n$ there is a field of order $p^n$, and any two fields of the
same finite order are isomorphic.


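Consider next the field $F = GF(2, x^2 + x + 1)$ where $\alpha$ is the Galois imaginary
associated with $x^2 + x + 1$. That is, $F = \{0, 1, \alpha, 1 + \alpha\}$ where $\alpha^2 = \alpha + 1$. The quadratic
polynomial $x^2 + x + \alpha$ has no zeroes in $F$ and is therefore irreducible over it. In
accordance with Theorem 10.16

$$F[x]/(x^2 + x + \alpha)$$

is a field. If we write $\lambda$ for $[x]$ and replace $[r]$ by $r$ for all $r \in F$, the elements of this
field are

$$\begin{aligned}
[0] &= 0, & [1] &= 1, \\
[\alpha] &= \alpha, & [1 + \alpha] &= 1 + \alpha, \\
[x] &= \lambda, & [1 + x] &= 1 + \lambda, \\
[\alpha + x] &= \alpha + \lambda, & [1 + \alpha + x] &= 1 + \alpha + \lambda, \\
[\alpha x] &= \alpha\lambda, & [1 + \alpha x] &= 1 + \alpha\lambda, \\
[\alpha + \alpha x] &= \alpha + \alpha\lambda, & [1 + \alpha + \alpha x] &= 1 + \alpha + \alpha\lambda, \\
[(1 + \alpha)x] &= \lambda + \alpha\lambda, & [1 + (1 + \alpha)x] &= 1 + \lambda + \alpha\lambda, \\
[\alpha + (1 + \alpha)x] &= \alpha + \lambda + \alpha\lambda, & [1 + \alpha + (1 + \alpha)x] &= 1 + \alpha + \lambda + \alpha\lambda.
\end{aligned}$$

The elements of this field are to be added and multiplied as indicated by the following
examples:

$$(1 + \alpha\lambda) + (1 + \alpha + \lambda + \alpha\lambda) = 2 + \alpha + \lambda + 2\alpha\lambda = \alpha + \lambda,$$

and, bearing in mind that $\alpha^2 = \alpha + 1$ and $\lambda^2 = \lambda + \alpha$,

$$\begin{aligned}
(1 + \alpha\lambda)(1 + \alpha + \lambda + \alpha\lambda) &= 1 + \alpha + \lambda + \alpha\lambda + \alpha\lambda + \alpha^2\lambda + \alpha\lambda^2 + \alpha^2\lambda^2 \\
&= 1 + \alpha + \lambda + (1 + \alpha)\lambda + \alpha(\lambda + \alpha) + (1 + \alpha)(\lambda + \alpha) \\
&= 1 + \alpha + \lambda + \lambda + \alpha\lambda + \alpha\lambda + \alpha^2 + \lambda + \alpha + \alpha\lambda + \alpha^2 \\
&= 1 + \lambda + \alpha\lambda.
\end{aligned}$$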




Since this field has the same number of elements as $GF(2, x^4 + x + 1)$, it follows from
the unproved Theorem 10.17 that they should be isomorphic. This is borne out by the
computation

$$\lambda^4 = (\lambda^2)^2 = (\lambda + \alpha)^2 = \lambda^2 + \alpha^2 = \lambda + \alpha + \alpha + 1 = \lambda + 1,$$

which shows that $\lambda$ is a zero of the polynomial $x^4 + x + 1$ over $Z_2$. In other words, we
can think of $\lambda$ as the Galois imaginary of the field $GF(2, x^4 + x + 1)$.
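
The arithmetic of this sixteen-element field can be checked by building it as a tower of two quadratic extensions. The sketch below assumes that an element $r + s\alpha$ of $F$ is stored as the pair `(r, s)` of bits, and that an element $u + v\lambda$ of the larger field is stored as a pair `(u, v)` of such pairs. The names `add4`, `mul4`, `add16`, and `mul16` are hypothetical and chosen only for this illustration.

```python
# GF(4): r + s*alpha stored as (r, s), with alpha^2 = alpha + 1 over Z_2.
def add4(u, v):
    return (u[0] ^ v[0], u[1] ^ v[1])

def mul4(u, v):
    r1, s1 = u
    r2, s2 = v
    return ((r1 & r2) ^ (s1 & s2), (r1 & s2) ^ (s1 & r2) ^ (s1 & s2))

ZERO, ONE, ALPHA = (0, 0), (1, 0), (0, 1)
ONE_PLUS_ALPHA = (1, 1)

# The 16-element field: u + v*lam stored as (u, v), with lam^2 = lam + alpha.
def add16(p, q):
    return (add4(p[0], q[0]), add4(p[1], q[1]))

def mul16(p, q):
    u1, v1 = p
    u2, v2 = q
    vv = mul4(v1, v2)                                   # coefficient of lam^2
    const = add4(mul4(u1, u2), mul4(vv, ALPHA))         # u1 u2 + v1 v2 alpha
    lin = add4(add4(mul4(u1, v2), mul4(v1, u2)), vv)    # u1 v2 + v1 u2 + v1 v2
    return (const, lin)

LAM = (ZERO, ONE)
x = (ONE, ALPHA)                       # 1 + alpha*lam
y = (ONE_PLUS_ALPHA, ONE_PLUS_ALPHA)   # 1 + alpha + lam + alpha*lam

total = add16(x, y)
product = mul16(x, y)
lam4 = mul16(mul16(LAM, LAM), mul16(LAM, LAM))
print(total, product, lam4)
```

The three printed pairs encode $\alpha + \lambda$, $1 + \lambda + \alpha\lambda$, and $\lambda + 1$ respectively, in agreement with the hand computations.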

Let us examine some extensions of the rationals $Q$. Since there is no rational number
whose square equals 2, it follows that the polynomial $x^2 - 2$ is irreducible over $Q$.
Consequently, Theorem 10.16 yields

$$Q[x]/(x^2 - 2)$$

as a new field. In accordance with Proposition 10.14, the elements of this field all have
the form $[a] + [b][x]$ where $a$ and $b$ are rational numbers and $[x]$ is a quantity such
that

$$[x]^2 - [2] = [x^2 - 2] = [0].$$

Since $[x]$ behaves just like a square root of 2, we denote it by the formal symbol $\sqrt{2}$.
This formalism notwithstanding, note that

$$(\sqrt{2})^2 = [x]^2 = [2] = 2$$

in this field. If we persist in identifying $[r]$ with $r$ for each rational number $r$, then the
elements of this new field have the form $a + b\sqrt{2}$ for $a, b \in Q$.

The addition and multiplication of the elements of this field are given by

$$(a_1 + b_1\sqrt{2}) + (a_2 + b_2\sqrt{2}) = (a_1 + a_2) + (b_1 + b_2)\sqrt{2}$$

and

$$(a_1 + b_1\sqrt{2})(a_2 + b_2\sqrt{2}) = (a_1 a_2 + 2 b_1 b_2) + (a_1 b_2 + a_2 b_1)\sqrt{2}.$$

Note that

$$(a + b\sqrt{2})^{-1} = \frac{a}{a^2 - 2b^2} - \frac{b}{a^2 - 2b^2}\sqrt{2}.$$
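
These formulas can be tried out with exact rational arithmetic. The sketch below assumes that $a + b\sqrt{2}$ is stored as a pair `(a, b)` of `Fraction` values; the function names `mul` and `inv` are hypothetical and serve only this illustration.

```python
from fractions import Fraction

def mul(u, v):
    """(a1 + b1*sqrt2)(a2 + b2*sqrt2) with elements stored as pairs (a, b)."""
    a1, b1 = u
    a2, b2 = v
    return (a1 * a2 + 2 * b1 * b2, a1 * b2 + a2 * b1)

def inv(u):
    """Inverse of a + b*sqrt2 for (a, b) != (0, 0)."""
    a, b = u
    n = a * a - 2 * b * b   # nonzero for rational a, b not both zero, since sqrt2 is irrational
    return (a / n, -b / n)

u = (Fraction(3), Fraction(1))   # 3 + sqrt2
print(inv(u), mul(u, inv(u)))
```

Multiplying an element by its computed inverse returns the pair that encodes 1.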




Let us denote this new field $Q[x]/(x^2 - 2)$ by $F_1$. The field $F_1$ can itself serve as the
ground field for the construction of another field. Consider, for example, the polynomial
$x^2 - 3$. This polynomial is irreducible over $F_1$. To justify this claim it suffices to show
that the equation

$$(x + y\sqrt{2})^2 - 3 = 0$$

has no solution wherein both $x$ and $y$ are rational. However, this equation simplifies to

$$x^2 + 2y^2 + 2xy\sqrt{2} = 3.$$

If $xy = 0$, this reduces to $x^2 = 3$ or $2y^2 = 3$, neither of which has a rational solution.
Otherwise it can be rewritten as

$$\sqrt{2} = \frac{3 - x^2 - 2y^2}{2xy},$$

which cannot have rational solutions since $\sqrt{2}$ is known not to be a rational number.

The quadratic $x^2 - 3$ being irreducible over $F_1$, it yields yet a new field

$$F_1[x]/(x^2 - 3)$$

whose typical element, when $[x]$ is symbolized by $\sqrt{3}$, is

$$(a_1 + b_1\sqrt{2}) + (a_2 + b_2\sqrt{2})\sqrt{3}.$$

If we abbreviate $\sqrt{2}\sqrt{3}$ to the symbol $\sqrt{6}$, then all the elements of $F_1[x]/(x^2 - 3)$ can
be written in the form

$$a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6},$$

with $a, b, c, d \in Q$.

It is clear that Theorem 10.16 can be used to construct a myriad of new fields. The 
general theory of these fields and their classification falls outside the scope of this book. 

We conclude this section with a warning about a possible source of confusion. While 
the powerful Theorem 10.16 resembles the Fundamental Theorem of Algebra, which 
asserts the existence of complex zeroes to every complex polynomial, the two theorems 
are distinct. The zeroes whose existence is guaranteed by Theorem 10.16 need not 
belong to the ground field, as is exemplified by the polynomial $x^2 + x + 1$ over $Z_2$. The
Fundamental Theorem of Algebra, above and beyond asserting the mere existence of zeroes
of complex polynomials, also places them back in the ground field, which Theorem 10.16
does not do.


Exercises 10.3 


For each pair $F$ and $P(x)$ in Exercises 10.3.1 to 10.3.8, describe a subfield of the complex
numbers that is isomorphic to $F[x]/(P(x))$.

1. $F = Q$, $P(x) = x^2 - 5$
2. $F = Q$, $P(x) = x^2 - 2$
3. $F = Q$, $P(x) = x^2 + 1$
4. $F = Q$, $P(x) = x^2 + 25$
5. $F = Q$, $P(x) = x^2 + x + 1$
6. $F = Q[x]/(x^2 - 2)$, $P(x) = x^2 - 5$
7. $F = Q[x]/(x^2 - 2)$, $P(x) = x^2 + 1$
8. $F = R$, $P(x) = x^2 + 5$


For each of the fields in Exercises 10.3.9 to 10.3.12, find a field that is described in 


Chapter 7 and is isomorphic to it. 
9. Z,[x]/(x?+x+1) un. Z,[x]/(x4+x3 +x? +x 41) 


10. Z,[x]/(x? +x? +1) 12. Z,[x]/(x? + x +2) 


Explain why each of the fields in Exercises 10.3.13 to 10.3.16 is finite. 
13. Z,[x]/(x?+x4+1) 15. Z,[x]/(xt+x> +x?+x41) 
14. Z,[x]/(xi+x+1) 16. Zs[x]/(x? + 4x +2) 


17. Prove that the set of real numbers

$$\{a + b\sqrt{2} + c\sqrt{3} \mid a, b, c \in Q\}$$

does not constitute a subfield of the real numbers (with respect to the usual
arithmetical operations).

18. Prove that the set of real numbers

$$\{a + b\sqrt{7} + c\sqrt{11} \mid a, b, c \in Q\}$$

does not constitute a subfield of the real numbers (with respect to the usual
arithmetical operations).

19. What is the multiplicative inverse of $2 + 3[x]$ in the field $Q[x]/(x^2 - 5)$?

20. What is the multiplicative inverse of $3 + 2[x]$ in the field $Q[x]/(x^2 - 7)$?

21. What is the multiplicative inverse of the element $1 + [x]$ in the field $Q[x]/(x^2 + 5)$?




22. Let $r$ be any rational number such that $\sqrt{r}$ is not rational. Find a formula for the
multiplicative inverse of $a + b[x]$ in $Q[x]/(x^2 - r)$.

23. Suppose $P(x)$ is a polynomial of degree $d$ over a field $F$. Prove that there exists
an extension $F'$ of $F$ such that, counting multiplicities, $P(x)$ has $d$ zeroes in $F'$.

24. Prove that the multiplication of the elements of $F[x]/(P(x))$ defined in
Equation 10.15 is unambiguous.

25. Show that if $P(x)$ is a reducible element of $F[x]$, then the multiplication defined
in Equation 10.15 does not yield a field.


10.4 Galois Groups and the Resolvability of Equations 


We would like to conclude this chapter with a brief account of how the young Galois 
settled the question of the algebraic resolvability of equations. Because of the introductory 
nature of this text, such an account must of necessity be superficial, and a more complete 
exposition of the theory can be found in many graduate texts. 

Briefly put, if P(x) is an irreducible polynomial, with either real or complex coefficients, 
then Galois associated a certain group of permutations with the equation P(x) = 0 and 
then proved that an appropriate analysis of the group yields the answer as to whether or 
not this equation is algebraically resolvable. We shall now discuss both the group and its 
analysis. 

Let P(x) be an irreducible polynomial with integer coefficients. The Galois group of 
the polynomial equation P(x) = 0 is a group of permutations of the roots of the equation 
(recall that the existence of these roots is guaranteed by the Fundamental Theorem of 
Algebra of Section 3.3) that enjoys two properties: 

First, every rational expression in the roots that is invariant under all the permutations 
in the group has a rational expression in the coefficients of the equation. 

Second, conversely, every rational expression in the roots that is also a rational ex- 
pression in the coefficients is necessarily invariant under all the permutations of the 
group. 

Consider, for example, the cyclotomic equation $x^4 + x^3 + x^2 + x + 1 = 0$. Since
$x^5 - 1 = (x - 1)(x^4 + x^3 + x^2 + x + 1)$, it follows that the roots of this equation are
$\varepsilon$, $\varepsilon^2$, $\varepsilon^3$, and $\varepsilon^4$, where $\varepsilon$ is the first 5th root of unity. It is known that the Galois
group of this equation is the permutation group of order 4 generated by the cycle
$\sigma = (\varepsilon \;\, \varepsilon^2 \;\, \varepsilon^4 \;\, \varepsilon^3)$. We
will now illustrate the meaning of the two properties above by investigating this group’s
effect on some rational expressions in these roots. Suppose first that $a$, $b$, $c$, and $d$ are




integers such that the expression

$$\varphi = (\varepsilon)^a (\varepsilon^2)^b (\varepsilon^4)^c (\varepsilon^3)^d$$

is invariant under $\sigma$ (and so it is necessarily also invariant under all the permutations in
the Galois group since, in this case, they are all powers of $\sigma$). This means that

$$(\varepsilon)^a (\varepsilon^2)^b (\varepsilon^4)^c (\varepsilon^3)^d = (\varepsilon^2)^a (\varepsilon^4)^b (\varepsilon^3)^c (\varepsilon)^d \tag{10.18}$$

and hence

$$a + 2b + 4c + 3d \equiv 2a + 4b + 3c + d \pmod 5, \tag{10.19}$$

and subtraction surprisingly yields

$$0 \equiv a + 2b + 4c + 3d \pmod 5. \tag{10.20}$$

From this it immediately follows that

$$\varphi = \varepsilon^{a + 2b + 4c + 3d} = 1.$$

Thus, as required by the first property, the invariance of $\varphi$ under the permutations of the
Galois group was sufficient to guarantee its rationality. Suppose now that we only know
$\varphi$ to be rational. Since $\varphi$ is necessarily a fifth root of unity, it follows that $\varphi = 1$ and
so Equation 10.20 holds. Equation 10.20 entails Equation 10.19 and this one implies
Equation 10.18. In other words, from the mere assumption of the rationality of $\varphi$ it was
possible to prove its invariance under $\sigma$ and all the elements of the Galois group.
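
The equivalence between Equations 10.19 and 10.20 can be confirmed by exhausting all exponents modulo 5. The sketch below works only with exponents of $\varepsilon$, so no complex arithmetic is needed; the function names `exponent` and `exponent_after_sigma` are hypothetical and introduced for this check.

```python
from itertools import product

def exponent(a, b, c, d):
    """Exponent of eps in (eps)^a (eps^2)^b (eps^4)^c (eps^3)^d, reduced mod 5."""
    return (a + 2 * b + 4 * c + 3 * d) % 5

def exponent_after_sigma(a, b, c, d):
    # sigma sends eps -> eps^2 -> eps^4 -> eps^3 -> eps, i.e. it doubles exponents mod 5.
    return (2 * a + 4 * b + 3 * c + d) % 5

tuples = list(product(range(5), repeat=4))
invariant = [t for t in tuples if exponent(*t) == exponent_after_sigma(*t)]
equal_to_one = [t for t in tuples if exponent(*t) == 0]

# Invariance under sigma holds exactly when the expression equals 1.
assert invariant == equal_to_one
print(len(invariant), "of", len(tuples))
```

Since exponents only matter modulo 5, checking the 625 residue tuples covers every choice of integers $a$, $b$, $c$, $d$.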

Galois himself gives two examples of these groups. If $x_1, x_2, \ldots, x_n$ denote the roots of
the general equation

$$x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n = 0,$$

then the Galois group of this equation consists of all the permutations of these roots.
The Fundamental Theorem of Symmetric Polynomials, briefly mentioned in Section 6.4,
asserts that every symmetric rational polynomial of the variables $x_1, x_2, \ldots, x_n$ can be
expressed as a function of the elementary symmetric polynomials of these variables. By
Theorem 6.19, when these variables denote the roots of the above equation, their
elementary symmetric functions equal $(-1)^k a_k$ for $k = 1, 2, \ldots, n$. Thus, the first condition is




satisfied. That the second condition is satisfied is harder to show, and we will not do so 
here. 

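The second example Galois gives is that of the cyclotomic equation $x^p - 1 = 0$ where
$p$ is a prime number. Since the polynomial $x^p - 1$ is never irreducible, it is necessary to
divide out the factor $x - 1$ after which we get

$$x^{p-1} + x^{p-2} + \cdots + x + 1 = 0 \tag{10.21}$$

which can be proved to be irreducible (for prime $p$). The Galois group of this equation
is the cyclic group $\langle \sigma \rangle$ where

$$\sigma = (\varepsilon \;\, \varepsilon^{k} \;\, \varepsilon^{k^2} \;\, \cdots \;\, \varepsilon^{k^{p-2}})$$

where $\varepsilon$ is any primitive $p$-th root of unity and $k$ is any primitive root modulo $p$. In
the special case of $p = 17$ where $k$ is taken to be 3, we get

$$\sigma = (\varepsilon \;\, \varepsilon^{3} \;\, \varepsilon^{9} \;\, \varepsilon^{10} \;\, \varepsilon^{13} \;\, \varepsilon^{5} \;\, \varepsilon^{15} \;\, \varepsilon^{11} \;\, \varepsilon^{16} \;\, \varepsilon^{14} \;\, \varepsilon^{8} \;\, \varepsilon^{7} \;\, \varepsilon^{4} \;\, \varepsilon^{12} \;\, \varepsilon^{2} \;\, \varepsilon^{6}).$$

Since $\sigma$ is in general a cyclic permutation of $p - 1$ elements, it follows that the Galois
group of Equation 10.21 is isomorphic to $(Z_{p-1}, +)$.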
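
The exponents appearing in the cycle $\sigma$ are just the successive powers of $k$ modulo $p$. A minimal sketch, assuming a hypothetical helper named `galois_cycle` that lists those exponents in order:

```python
def galois_cycle(p, k):
    """Exponents e such that sigma = (eps^e1 eps^e2 ...), i.e. 1, k, k^2, ... modulo p."""
    exponents, e = [], 1
    while True:
        exponents.append(e)
        e = (e * k) % p
        if e == 1:
            break
    return exponents

cycle = galois_cycle(17, 3)
print(cycle)

# k is a primitive root modulo p exactly when the cycle visits all p - 1 nonzero residues.
assert len(cycle) == 16
```

For a value of $k$ that is not a primitive root, such as $k = 2$ with $p = 17$, the list is shorter than $p - 1$, which is why the text requires $k$ to be primitive.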



The analysis of the Galois group that determines the algebraic resolvability of its
originating equation has the form of a recursive procedure. If the group $G$ has prime
order $p$ (so that it is isomorphic to $(Z_p, +)$), then the equation is algebraically resolvable.
Next, if $G$ has composite order and it contains no proper normal subgroups, then the
originating equation is not algebraically resolvable. If neither of these conditions holds,
then G is a group of a composite order with a proper normal subgroup, say H. Now 
apply the same analysis to both H and G/H. If at any time we encounter a group of 
composite order without a proper normal subgroup, then the originating equation is not 
algebraically resolvable. Since at each stage the orders of H and G/H are smaller than 
the order of G, this process is bound to terminate. If all the groups of composite order 
encountered in this procedure have proper normal subgroups, the originating equation 
is algebraically resolvable. Otherwise, it is not so resolvable. 

As an example we consider the cyclotomic equation $x^{13} - 1 = 0$, whose irreducible
part is

$$x^{12} + x^{11} + \cdots + x + 1 = 0 \tag{10.22}$$




and whose Galois group is isomorphic to $(Z_{12}, +)$. The order of $(Z_{12}, +)$ is composite,
and $H = \{0, 6\}$ is a subgroup of $(Z_{12}, +)$. Since $(Z_{12}, +)$ is an abelian group, $H$ is necessarily
normal, and since $(Z_{12}, +)$ is cyclic, so is $G_1 = (Z_{12}, +)/H$. Now $H$ has order 2, which
is a prime, and so we are done with it. On the other hand, $G_1$ has the composite order
6, and so, by Theorem 9.14, $G_1$ is isomorphic to $(Z_6, +)$. Consequently, $G_1$ itself has a
(necessarily normal) subgroup $G_2$ of order 3. Thus $G_2$ has order 3 and $G_1/G_2$ has order
2, both being prime numbers. It follows from Galois’s theory that Equation 10.22 is
indeed algebraically resolvable; its resolution was first investigated by Gauss.
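
For cyclic groups the recursive procedure always succeeds, and it can be traced by a short program. The sketch below is an illustration restricted to $(Z_n, +)$, not a general implementation of Galois's criterion: at each step it splits off the subgroup of smallest prime order, which is one of several valid choices and differs from the order of steps taken in the text. The function names are hypothetical.

```python
def smallest_prime_factor(n):
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def subgroup_of_order(n, p):
    """The unique subgroup of order p in (Z_n, +), for p dividing n."""
    return sorted(k * (n // p) % n for k in range(p))

def prime_orders(n):
    """Orders of the prime-order groups met when the procedure is run on (Z_n, +), n >= 2."""
    p = smallest_prime_factor(n)
    if p == n:
        return [n]          # prime order: nothing left to split
    # H has order p, and the quotient Z_n / H is cyclic of order n // p.
    return [p] + prime_orders(n // p)

print(subgroup_of_order(12, 2), prime_orders(12))
```

Every group encountered has prime order at the end, so no composite-order group without a proper normal subgroup ever appears.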

Let us examine the general quintic equation. As noted above, its Galois group is
isomorphic to $S_5$, which has the composite order $5! = 120$. The symmetric group $S_5$ has
the group $A_5$ of all the even permutations of $\{1, 2, 3, 4, 5\}$ as a subgroup, and it follows
from Proposition 10.3 that $A_5$ is a normal subgroup of $S_5$ such that $S_5/A_5$ has order 2.
However, while $A_5$ has a plenitude of proper subgroups, we will now show that none of
the nontrivial ones is normal. Consequently, the general quintic equation is not resolvable
by radicals, a fact that had of course already been proved by Abel.

Proposition 10.23 The group $A_5$ has no proper normal subgroups.

Proof. Suppose $H$ is a normal subgroup of $A_5$ that contains some nonidentity element.
We will show that $H$ necessarily equals $A_5$ by demonstrating the following two statements:

• $H$ contains a 3-cycle;

• $H$ contains all the 3-cycles of $A_5$.

Since we already know that every even permutation is expressible as the composition of
3-cycles (Exercise 8.4.13), it will follow that $H = A_5$.

Proof that $H$ contains a 3-cycle: Since all the elements of $A_5$ are even permutations,
it follows that their disjoint cycle decompositions consist either of a single 5-cycle, a
single 3-cycle, or a pair of transpositions such as $(1\,2)(3\,4)$. If $H$ contains a 5-cycle,
say $(1\,2\,3\,4\,5)$, then, because $H$ is a normal subgroup of $A_5$, it must also contain the
element

$$(1\,2\,3\,4\,5)^{-1}[(1\,2\,3)(1\,2\,3\,4\,5)(1\,2\,3)^{-1}]
= (5\,4\,3\,2\,1)(1\,2\,3)(1\,2\,3\,4\,5)(3\,2\,1) = (1\,3\,5).$$



If $H$ contains a pair of transpositions, say $(1\,2)(3\,4)$, then, because $H$ is a normal
subgroup, it must also contain the element

$$(1\,2)(3\,4)[(2\,5)(3\,4)](1\,2)(3\,4)[(2\,5)(3\,4)]^{-1}
= (1\,2)(3\,4)(2\,5)(3\,4)(1\,2)(3\,4)(3\,4)(2\,5) = (1\,5\,2).$$

Thus, if $H$ contains any nonidentity element, then it necessarily contains some 3-cycle.

Proof that $H$ contains all the 3-cycles of $A_5$: By the above we know that $H$ contains
some 3-cycle, say $(1\,2\,3)$. Moreover, if $(a\,b\,c)$ is any 3-cycle of $A_5$ then by Exercise 8.2.23

$$(1\,2\,3\,4\,5)(a\,b\,c)(1\,2\,3\,4\,5)^{-1} = (a{+}1 \;\, b{+}1 \;\, c{+}1)$$

where the addition is computed modulo 5. Hence, since $(1\,2\,3\,4\,5) \in A_5$ and $H$ is
normal in $A_5$, it follows that $H$ must contain the 3-cycles $(1\,2\,3)$, $(2\,3\,4)$, $(3\,4\,5)$,
$(4\,5\,1)$, and $(5\,1\,2)$ as well as their inverses. Since

$$[(1\,2)(3\,4)](1\,2\,3)[(1\,2)(3\,4)]^{-1} = (1\,2)(3\,4)(1\,2\,3)(3\,4)(1\,2) = (1\,4\,2),$$

it follows for similar reasons that $H$ must also contain the 3-cycles $(1\,4\,2)$, $(2\,5\,3)$,
$(3\,1\,4)$, $(4\,2\,5)$, and $(5\,3\,1)$ and their inverses. As these exhaust all the twenty 3-cycles
of $A_5$, the proof is complete.
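
Because $A_5$ has only 60 elements, the proposition can also be confirmed by brute force. The sketch below does not follow the 3-cycle argument of the proof; instead, for every nonidentity element $h$ it computes the smallest normal subgroup containing $h$ and checks that it is all of $A_5$. Permutations act on $\{0, 1, 2, 3, 4\}$ and are stored as tuples; all function names are hypothetical.

```python
from itertools import permutations

def compose(s, t):
    """The permutation 'apply t first, then s', with permutations stored as tuples."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)

def is_even(s):
    inversions = sum(1 for i in range(len(s)) for j in range(i + 1, len(s)) if s[i] > s[j])
    return inversions % 2 == 0

A5 = [s for s in permutations(range(5)) if is_even(s)]
identity = tuple(range(5))

def normal_closure(h):
    """Smallest normal subgroup of A5 containing h: the subgroup generated by its conjugates."""
    generators = {compose(compose(g, h), inverse(g)) for g in A5}
    subgroup, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in subgroup:
                subgroup.add(y)
                frontier.append(y)
    return subgroup

assert len(A5) == 60
# Any normal subgroup containing h contains its normal closure, so this check
# shows that the only normal subgroups are the identity subgroup and A5 itself.
assert all(len(normal_closure(h)) == 60 for h in A5 if h != identity)
print("every nonidentity element of A5 normally generates A5")
```

In a finite group, closing a set of generators under products already yields a subgroup, which is why the search never needs to add inverses explicitly.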


The above proposition generalizes to the statement that $A_n$ is simple for each $n \geq 5$.

This concludes our attempt at an elementary description of Galois theory. It remains 
only to say a few words about the subsequent evolution of group theory. 

We saw that groups caught the attention of mathematicians because they provided the
key to the question of which polynomial equations are algebraically resolvable. Thus,
the Galois group of an equation contains all the information that is required to decide 
on its algebraic resolvability. Mathematicians subsequently went on to try to classify all 
these new structures and eventually a very clear cut and apparently also very difficult 
question crystallized. Is every finite group necessarily isomorphic to the Galois group of 
some equation with integer coefficients? As of the writing of this text the answer to this 
question is still unknown. 

Groups that contain no proper normal subgroups were seen to play a key role in Galois 


theory and are called simple groups. The commutative simple groups are the cyclic groups 




of prime order. The group $A_5$, which is the subject of Proposition 10.23, is the smallest of
the noncommutative simple groups. The appellation “simple” is not to be taken literally.
These groups have in fact very complicated structures. The joint efforts of hundreds of
group theorists resulted recently in the complete classification of the finite simple groups.
This monumental work occupies about 14,000 pages of mathematical publications. The
last finite simple group to be identified is affectionately known as the monster. Its order is

$$2^{46} \cdot 3^{20} \cdot 5^{9} \cdot 7^{6} \cdot 11^{2} \cdot 13^{3} \cdot 17 \cdot 19 \cdot 23 \cdot 29 \cdot 31 \cdot 41 \cdot 47 \cdot 59 \cdot 71 \approx 10^{54}.$$


The monster happens to be the group of vertex symmetries of a solid that resides in a 
space of 196,883 dimensions. Strangely enough, the number 196,884 features in some 


applications of non-Euclidean geometry to number theory, but that’s another story. 


Exercises 10.4 


1. Prove that no commutative group of composite (or infinite) order is simple.

2. Suppose G is a subgroup of S_n. Prove that, if G is simple and o(G) > 2, then
G ⊆ A_n.

3. Prove that the group of vertex symmetries of the regular octahedron (Figure 9.6) is
not simple.

4. Prove that the group of vertex symmetries of the cube (Figure 9.5) is not simple.

5. Suppose that G is a finite simple group and 1_G ≠ a ∈ G. Prove that every element
of G can be expressed as a product of elements of the conjugacy class C(a) (cf.
Exercises 10.1.24 to 10.1.30).

6. Prove that A_4 is not a simple group.


Chapter Summary 


We have shown that new groups can be obtained from old ones by the quotient operation. 
This operation was applied to additive groups of polynomials to produce a host of new 
and old fields. Finally, the notion of quotient groups permitted us to formulate Galois’s 


criterion for the resolvability of algebraic equations. 




Chapter Review Exercises 


Mark the following true or false. 
1. If K is the Klein 4-group, then KK contains more elements than K’.
2. K isa normal subgroup of D,. 
3- Zi, has a subgroup that is isomorphic to K. 
4. The number of distinct elements of Z,[x]/ (x? +2x +1) is 25. 
5- (Z,,+) isa simple group. 


6. The complex numbers constitute an extension of the rational numbers. 


New Terms 

conjugacy class, 234 monomorphism, 235 
conjugate, 234 normal subgroup, 228 
extension, 246 quotient group, 230 
homomorphism, 234 signature, 236 

kernel, 237 subfield, 246 
Supplementary Exercises 


1. Write a computer script which will decide whether a given subgroup of some group 
is normal. If the answer is yes, write out a multiplication table for the quotient 
group. 


2. Prove that the alternating group A_n is simple for n ≥ 5.


Chapter 11 
TOPICS IN ELEMENTARY GROUP THEORY 


This chapter displays some of the methods and results of elementary group theory.
Specifically, we demonstrate how many more groups can be constructed and classify,
up to isomorphism, all the groups of orders 2p and p^2 for p prime.


11.1 The Direct Product of Groups 


In this section we describe one of many methods for combining groups to produce new
groups. If G and H are two groups, then their direct product, denoted by G × H, has as
its elements the set of all the ordered pairs (g, h) where g ∈ G and h ∈ H. The binary
operation of G × H is defined by

(g, h)(g', h') = (gg', hh').

The associativity of this operation follows directly from the associativity of the group
operations of G and H. The identity element of G × H is (1_G, 1_H), and the inverse of
(g, h) is the pair (g^{-1}, h^{-1}). Thus, Z_2 × Z_2 consists of the four pairs (0,0), (1,0), (0,1),
and (1,1) where

(1,1)(1,0) = (1+1, 1+0) = (0,1) and
(1,1)(1,1) = (1+1, 1+1) = (0,0).

Similarly, Z_2 × Z_3 consists of the six pairs (0,0), (0,1), (0,2), (1,0), (1,1), and (1,2)
where

(1,1)(0,2) = (1+0, 1+2) = (1,0) and
(1,2)(0,2) = (1+0, 2+2) = (1,1).
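The componentwise operation is easy to experiment with. The sketch below is an illustration only: it models Z_m × Z_n with pairs of residues, and the function names are assumptions of the sketch rather than notation from the text.

```python
from itertools import product

def direct_product_elements(m, n):
    """List the ordered pairs that make up Z_m x Z_n."""
    return list(product(range(m), range(n)))

def add_pairs(x, y, m, n):
    """Componentwise operation of Z_m x Z_n: add mod m in the first slot, mod n in the second."""
    return ((x[0] + y[0]) % m, (x[1] + y[1]) % n)

# The four sample computations from the text.
examples = [
    add_pairs((1, 1), (1, 0), 2, 2),
    add_pairs((1, 1), (1, 1), 2, 2),
    add_pairs((1, 1), (0, 2), 2, 3),
    add_pairs((1, 2), (0, 2), 2, 3),
]
```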




It is clear that if G and H are finite groups, then o(G × H) = o(G)o(H). It is also easy to
see that the function f((g, h)) = (h, g) defines an isomorphism of G × H and H × G.
It is equally clear that the subgroups

G' = {(g, 1_H) | g ∈ G} and H' = {(1_G, h) | h ∈ H}

of G × H are isomorphic to G and H, respectively. In particular, o((g, 1_H)) = o(g)
and o((1_G, h)) = o(h). The next proposition tells us how to determine the order of any
element of G × H.


Proposition 11.1 If g and h are elements of finite orders in the groups G and H,
respectively, then o((g, h)) is the least common multiple of o(g) and o(h).

Proof. Let k be the least common multiple of o(g) and o(h) and let d = o((g, h)). Since

(g, h)^k = (g^k, h^k) = (1_G, 1_H)

it follows that d divides k. Conversely, since

(1_G, 1_H) = 1_{G×H} = (g, h)^d = (g^d, h^d)

it follows that 1_G = g^d and 1_H = h^d. Thus d is divisible by both o(g) and o(h), and
so, by Exercise 4.2.31, d is divisible by k. Hence, d = k. □
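Proposition 11.1 can be tested numerically in products of cyclic groups. In the sketch below (an illustration under the assumption that G = Z_m and H = Z_n, with hypothetical helper names) the order of a pair is found by repeated addition and compared with the least common multiple of the orders of its components.

```python
from math import gcd

def order_in_Zn(a, n):
    """Order of a in (Z_n, +), which equals n / gcd(a, n)."""
    return n // gcd(a, n)

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def order_in_product(pair, m, n):
    """Order of pair in Z_m x Z_n, found by adding the pair to itself until (0, 0) appears."""
    g, h = pair
    x, y = g % m, h % n
    k = 1
    while (x, y) != (0, 0):
        x, y = (x + g) % m, (y + h) % n
        k += 1
    return k

def proposition_holds(m, n):
    """Check o((g, h)) = lcm(o(g), o(h)) for every pair of Z_m x Z_n."""
    return all(
        order_in_product((g, h), m, n) == lcm(order_in_Zn(g, m), order_in_Zn(h, n))
        for g in range(m)
        for h in range(n)
    )
```

For instance, `order_in_product((1, 2), 2, 3)` is 6, so Z_2 × Z_3 is cyclic, whereas no element of Z_2 × Z_4 has order 8.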


It follows from Proposition 11.1 that Z_2 × Z_2 is a group of order 4 in which every
nonidentity element has order 2, and hence this group is isomorphic to K. Similarly,
the element (1,2) of Z_2 × Z_3 has order 2 · 3 = 6, and hence this group is isomorphic to
(Z_6, +). On the other hand, the group Z_2 × Z_4 has order 8, is commutative, contains
no elements of order 8, and contains an element of order 4, namely (0,1). This is
enough information to justify the assertion that Z_2 × Z_4 is not isomorphic to any of the
previously encountered groups of order 8, namely, the Quaternion group, D_4, (Z_8, +),
and Z_2[x, ≤2].

The preceding paragraph makes it clear that it would be useful to have some criteria for 
recognizing when a group is isomorphic to the direct product of some two other groups. 


This is now provided. 


Proposition 11.2 Suppose the finite group P contains two normal subgroups G and
H such that G ∩ H = {1_P} and o(P) = o(G)o(H). Then P ≅ G × H.

Proof. We begin by proving that the elements of G commute with the elements of H. Thus,
suppose g ∈ G and h ∈ H. Since G and H are normal in P it follows that

ghg^{-1}h^{-1} = (ghg^{-1})h^{-1} ∈ HH = H and
ghg^{-1}h^{-1} = g(hg^{-1}h^{-1}) ∈ GG = G.

Since G ∩ H = {1_P}, it follows that ghg^{-1}h^{-1} = 1_P and hence gh = hg for all g ∈ G
and h ∈ H.

We are now ready to prove the required isomorphism. Let f((g, h)) = gh. This
is clearly a function from G × H into P. If f((g, h)) = f((g', h')) then gh = g'h'
or (g')^{-1}g = h'h^{-1}. However, (g')^{-1}g ∈ G and h'h^{-1} ∈ H, and hence, since
G ∩ H = {1_P}, it follows that (g')^{-1}g = h'h^{-1} = 1_P, so that both g = g' and h = h'.
It follows that f maps distinct elements of G × H to distinct elements of P. Since
o(P) = o(G)o(H) = o(G × H), it follows that f does indeed match all the elements of
G × H with those of P. Finally, making use of the above-proved commutativity, note
that

f((g, h)(g', h')) = f((gg', hh')) = gg'hh' = ghg'h' = f((g, h))f((g', h'))

so that f is indeed an isomorphism. □


As a consequence of Proposition 11.2 we show that if p and q are any two distinct prime
numbers, then (Z_pq, +) ≅ (Z_p, +) × (Z_q, +). Let P = (Z_pq, +), G = ⟨p⟩ ≅ (Z_q, +), and
H = ⟨q⟩ ≅ (Z_p, +). Then P, G, and H satisfy the hypotheses of Proposition 11.2, and
hence the desired conclusion follows. The next corollary illustrates a somewhat more
complicated application of Proposition 11.2. First, however, we note that it is clear
for any three groups G, H, and J that (G × H) × J ≅ G × (H × J), this isomorphism
being established by the function f(((g, h), j)) = (g, (h, j)). Consequently, we can
unambiguously write G × H × J for (G × H) × J and G × (H × J). Further, if k is
any positive integer, then we denote the direct product of k copies of G by G^k.
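For concrete primes the hypotheses of Proposition 11.2 can be verified directly. The sketch below is illustrative only: it works inside (Z_pq, +), takes G = ⟨p⟩ and H = ⟨q⟩, and checks that the two subgroups meet trivially, that their orders multiply to pq, and that the map (g, h) ↦ g + h hits every element of Z_pq. The function names are assumptions of the sketch.

```python
def cyclic_subgroup(generator, n):
    """The subgroup of (Z_n, +) generated by the given element."""
    return {(generator * k) % n for k in range(n)}

def check_product_decomposition(p, q):
    """Return three booleans: trivial intersection, orders multiply to pq, f is onto Z_pq."""
    n = p * q
    G = cyclic_subgroup(p, n)  # has order q when p and q are distinct primes
    H = cyclic_subgroup(q, n)  # has order p when p and q are distinct primes
    trivial_intersection = (G & H == {0})
    orders_multiply = (len(G) * len(H) == n)
    images = {(g + h) % n for g in G for h in H}  # image of f((g, h)) = g + h
    onto = (len(images) == n)
    return trivial_intersection, orders_multiply, onto
```

Normality of G and H is automatic here because Z_pq is commutative. Calling the function with p = q (say 2 and 2) shows how the hypotheses fail when the primes are not distinct.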


Corollary 11.3 If P is a finite group in which every nonidentity element has order 2,
then there exists a nonnegative integer k such that P ≅ (Z_2, +)^k.

Proof. We proceed by induction on the order of P, the conclusion being trivially valid
when o(P) ≤ 2. We therefore assume that o(P) = n and that the proposition holds for
all groups of order less than n. Let H be a proper subgroup of P which is contained in
no other proper subgroup of P and let a be any element of P that is not in H. Then,
because P is a commutative group (Exercise 9.2.29), H' = H ∪ (aH) is also a subgroup
of P. Since H' contains both H and a, it follows from the maximality of H that
H' = P. Thus H has index 2 in P and so o(P) = 2o(H) = o(⟨a⟩)o(H). It now follows
from Proposition 11.2 that P ≅ ⟨a⟩ × H ≅ (Z_2, +) × H. Since every element of H also
has order 2 and H has order less than n, it follows from the induction hypothesis that
H ≅ (Z_2, +)^k for some k, and hence P ≅ (Z_2, +) × (Z_2, +)^k = (Z_2, +)^{k+1}. □


Exercises 11.1 


1. Let G and H be finite groups. Prove that the sets G' = {(g, 1_H) | g ∈ G} and
H' = {(1_G, h) | h ∈ H} are normal subgroups of G × H. Prove that (G × H)/G' ≅
H and (G × H)/H' ≅ G.

2. Let p be a prime and let G be a finite commutative group in which every nonidentity
element has order p. Prove that G ≅ (Z_p, +)^k for some nonnegative integer k.

3. Let p be a prime. Prove that every commutative group of order p^2 is isomorphic
to either (Z_{p^2}, +) or (Z_p, +)^2.

4. Suppose m and n are relatively prime positive integers. Prove that (Z_mn, +) ≅
(Z_m, +) × (Z_n, +).
§- Prove that (Zy99, +) x (Zp), +) = (Z, 43, +)  (Z393, +). 
A group is said to be a decomposable group if it is isomorphic to the direct product of two 
nontrivial groups. Otherwise, it is indecomposable. 
6. Prove that (Z_n, +) is indecomposable if and only if n = p^m for some prime p.
7. Prove that Ds is indecomposable. 
8. Prove that S, is indecomposable. 
9. Prove that S, is indecomposable. 
10. Prove that S, is indecomposable. 
11. Prove that the Quaternion group is indecomposable. 
12. Prove that if 1 < k < n, then S_n contains a subgroup P ≅ S_k × S_{n-k}.

13. Prove that if G and H are finite groups then G × H is cyclic if and only if both
G and H are cyclic and their orders are relatively prime.


14. Let p be any prime. Prove that for every positive integer » there are at least 
nonisomorphic groups of order p”. 
15. Prove that every finite group G of order greater than 2 has an automorphism that
is distinct from the identity function.


11.2 More Classifications


We begin by reviewing some information that was relegated to the exercises of Chapters 9
and 10. Two elements a and b of a group G are said to be conjugate if there exists an
element x such that xax^{-1} = b. The set of all the elements of G that are conjugate to a
is denoted by C(a) and is called the conjugacy class of a. Note that if a is conjugate to
b, then

a = x^{-1}xax^{-1}x = x^{-1}bx = x^{-1}b(x^{-1})^{-1}

so that b is also conjugate to a. Moreover, if b is also conjugate to c ∈ G, say b = ycy^{-1},
then

c = y^{-1}by = y^{-1}(xax^{-1})y = (y^{-1}x)a(y^{-1}x)^{-1},

so that a is also conjugate to c. These observations have the following consequence.


Lemma 11.4 If a and b are elements of the group G, then C(a) and C(b) are either
identical or disjoint.

Proof. Suppose C(a) and C(b) are not disjoint, so that c ∈ C(a) ∩ C(b) for some c ∈ G.
It then follows from the above observations that a and b, both being conjugate to c,
are also conjugate to each other. However, if d ∈ C(a), then d, being conjugate to a,
is also conjugate to b, so that C(a) ⊆ C(b). By symmetry, C(b) ⊆ C(a), and hence
C(a) = C(b). □


The centralizer Z_a of the element a of the group G consists of all the elements x in
G such that xa = ax (see Exercises 9.4.43 to 9.4.51). Note that Z_a always contains 1_G,
a, and every power of a. If a commutes with every element of G, so that Z_a = G, then
xax^{-1} = a for all x ∈ G, so that C(a) = {a}. If G is the Quaternion group (Table 9.3),
then Z_1 = Z_d = G and Z_x = {1, d, x, x^{-1}} for all other x ∈ G. The following proposition
shows that there is a very strong relationship between centralizers and conjugacy classes.

Proposition 11.5 Let G be a group and a ∈ G. Then Z_a is a subgroup of G and
|C(a)| = [G : Z_a].

Proof. If x, y ∈ Z_a, then xya = xay = axy and x^{-1}a = x^{-1}axx^{-1} = x^{-1}xax^{-1} = ax^{-1},
so that xy, x^{-1} ∈ Z_a. Hence, Z_a is a subgroup of G.

To prove the second part, note that each of the following statements is equivalent to
the next:

• x and y belong to the same coset of Z_a in G;

• x^{-1}y ∈ Z_a;

• x^{-1}ya = ax^{-1}y;

• yay^{-1} = xax^{-1}.

In other words, the two elements x and y belong to the same coset of Z_a if and only
if they conjugate a to the same element. Equivalently, the elements x and y belong to
distinct cosets of Z_a in G if and only if they conjugate a to distinct elements of C(a).
Either way, the cosets of Z_a have been matched up in a one-to-one fashion with the
elements of C(a), so that |C(a)| = [G : Z_a]. □


Note that in the quaternion group xax^{-1} = a or e according as x ∈ {1, a, d, e} or
not. Thus, |C(a)| = 2 = [G : Z_a], since we saw above that Z_a = {1, a, d, e}.
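The relation |C(a)| = [G : Z_a] can also be observed in a group that is not from the text's tables. The sketch below uses S_4, with permutations stored as tuples on the points 0..3 (a choice made for this illustration), and computes centralizers and conjugacy classes by brute force. The helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def centralizer(a, group):
    """All x in the group with xa = ax."""
    return [x for x in group if compose(x, a) == compose(a, x)]

def conjugacy_class(a, group):
    """All elements x a x^{-1} with x in the group."""
    return {compose(compose(x, a), inverse(x)) for x in group}

S4 = list(permutations(range(4)))

# Proposition 11.5 in the form |C(a)| * o(Z_a) = o(G), checked for every a in S_4.
index_formula_holds = all(
    len(conjugacy_class(a, S4)) * len(centralizer(a, S4)) == len(S4) for a in S4
)

# The distinct conjugacy classes, which by Lemma 11.4 partition the group.
distinct_classes = {frozenset(conjugacy_class(a, S4)) for a in S4}
```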
The center Z(G) of the group G consists of all the elements of G which commute
with every element of G. In other words,

Z(G) = ∩_{a ∈ G} Z_a.

It is clear that Z(G) = G if and only if G is a commutative group. The following theorem
provides a very useful tool in the search for the classification of groups.

Theorem 11.6 Let G be a finite group. Then there exist elements a_1, ..., a_k ∉ Z(G)
such that

o(G) = o(Z(G)) + Σ_{i=1}^{k} [G : Z_{a_i}]     (11.7)

where, for each i, [G : Z_{a_i}] > 1 and [G : Z_{a_i}]o(Z(G)) is a proper divisor of o(G).

Proof. As observed just prior to Proposition 11.5, every element of Z(G) constitutes a
conjugacy class by itself. Let C(a_1), ..., C(a_k) be a list of the other distinct conjugacy
classes of G. Since each element of G belongs in some conjugacy class, we have

o(G) = o(Z(G)) + Σ_{i=1}^{k} |C(a_i)|.

Equation 11.7 now follows from Proposition 11.5.
If [G : Z_{a_i}] = 1, then G = Z_{a_i}, implying that a_i commutes with all the elements of G
and contradicting the fact that a_i ∉ Z(G). Thus, [G : Z_{a_i}] > 1 for each i = 1, 2, ..., k.
Finally, since each a_i ∉ Z(G) it follows that Z(G) is a proper subgroup of each Z_{a_i}, and
hence [G : Z_{a_i}]o(Z(G)) is a proper divisor of [G : Z_{a_i}]o(Z_{a_i}) = o(G). □


If G is the Quaternion group, Z(G) = {1, d} and we can use a_1 = a, a_2 = b, and a_3 = c,
since C(a) = {a, e}, C(b) = {b, f}, and C(c) = {c, g}.
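The class equation of the Quaternion group can be recomputed from scratch. The sketch below does not use the labels a, b, ..., g of Table 9.3; as an assumption of the sketch it models the group in the familiar form {±1, ±i, ±j, ±k}, with each element stored as a pair (sign, unit), so that the element d of the text corresponds to -1.

```python
# Products of the basis units: TABLE[(x, y)] = (sign, unit) of x*y.
TABLE = {
    ('1', '1'): (1, '1'), ('1', 'i'): (1, 'i'), ('1', 'j'): (1, 'j'), ('1', 'k'): (1, 'k'),
    ('i', '1'): (1, 'i'), ('i', 'i'): (-1, '1'), ('i', 'j'): (1, 'k'), ('i', 'k'): (-1, 'j'),
    ('j', '1'): (1, 'j'), ('j', 'i'): (-1, 'k'), ('j', 'j'): (-1, '1'), ('j', 'k'): (1, 'i'),
    ('k', '1'): (1, 'k'), ('k', 'i'): (1, 'j'), ('k', 'j'): (-1, 'i'), ('k', 'k'): (-1, '1'),
}

Q8 = [(sign, unit) for sign in (1, -1) for unit in '1ijk']
IDENTITY = (1, '1')

def mul(x, y):
    """Multiply two signed units."""
    sign, unit = TABLE[(x[1], y[1])]
    return (x[0] * y[0] * sign, unit)

def inv(x):
    """Find the inverse of x by searching the group."""
    return next(y for y in Q8 if mul(x, y) == IDENTITY)

def conjugacy_class(a):
    """All elements x a x^{-1} with x in Q8."""
    return frozenset(mul(mul(x, a), inv(x)) for x in Q8)

center = [z for z in Q8 if all(mul(z, x) == mul(x, z) for x in Q8)]
noncentral_classes = {conjugacy_class(a) for a in Q8 if a not in center}

# Right-hand side of Equation 11.7: o(Z(G)) plus the sizes of the noncentral classes.
class_equation_total = len(center) + sum(len(c) for c in noncentral_classes)
```

The computation gives a center of order 2 and three classes of size 2, so the class equation reads 8 = 2 + 2 + 2 + 2, in agreement with the example above.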

Equation 11.7 is called the class equation of G. It has a surprising number of
implications for finite groups. We demonstrate this first by classifying all the groups of
order p^2 up to isomorphism. Note that this corollary is a very strong generalization of
Proposition 9.17.


Corollary 11.8 Let G be a group of order p^2, where p is a prime. Then G is isomorphic
to either (Z_{p^2}, +) or to (Z_p, +)^2.

Proof. If G is cyclic, it is isomorphic to (Z_{p^2}, +). Hence we may assume that every
nonidentity element of G has order p. We first show that in the class equation of
G, k = 0, so that Z(G) = G and hence G is in fact commutative. To see this assume
that k > 0 and let a_1, ..., a_k be as in Theorem 11.6. Since o(G) = p^2, it follows that
[G : Z_{a_i}] = p for each i = 1, 2, ..., k. Hence o(Z(G)) must also be divisible by p. This,
however, contradicts the fact that [G : Z_{a_i}]o(Z(G)) is a proper divisor of o(G) = p^2.
Thus, G is commutative.

Now that G is known to be a commutative group in which each nonidentity element
has order p, let a, b ∈ G be such that ⟨a⟩ and ⟨b⟩ are distinct subgroups of G. Since they
both have prime order p, it follows that ⟨a⟩ ∩ ⟨b⟩ = {1_G} and so, by Proposition 11.2,

G ≅ ⟨a⟩ × ⟨b⟩ ≅ (Z_p, +) × (Z_p, +) = (Z_p, +)^2. □


We are already familiar with two noncyclic groups of order p^2 for each prime p. These
are the groups Z_p[x, ≤1] and (GF(p, P(x)), +) where P(x) is any irreducible quadratic
over Z_p. Since all the nonzero elements of these groups have order p, it follows that
they are isomorphic to (Z_p, +)^2. The next theorem, above and beyond its utility in the
classification of finite groups, also provides a partial converse to Lagrange’s Theorem
(Theorem 9.8) about the orders of subgroups. The actual converse to Lagrange’s Theorem
is false (Exercises 9.4.28 to 9.4.38).

Theorem 11.9 (Cauchy) If the order of the finite group G is divisible by the prime p,
then G contains an element of order p.

Proof. We proceed by induction on o(G). The theorem is trivial for groups of order
one. Let the prime p be fixed, let n be a positive integer divisible by p and assume the
theorem to be true for all groups of order less than n. Let G be a group of order n and
assume that G has no element of order p. By Proposition 9.7 we may assume that the
order of every element of G is relatively prime to p.

Let a_1, a_2, ..., a_k be as in Theorem 11.6. Since Z_{a_i} is a subgroup of G, it contains no
element of order p. Since Z_{a_i} ≠ G, it follows that o(Z_{a_i}) < n, so that by the induction
hypothesis p cannot be a divisor of o(Z_{a_i}). However, o(G) = [G : Z_{a_i}]o(Z_{a_i}), and hence
p must be a divisor of [G : Z_{a_i}] for each i = 1, 2, ..., k. It now follows from the class
equation (Equation 11.7) that p divides o(Z(G)).

Let z be any nonidentity element of Z(G) (z exists because o(Z(G)) > 1) and let
H = ⟨z⟩. Since Z(G) is commutative, it follows that H is a normal subgroup of
Z(G). Since p does not divide o(H), it must divide o(Z(G)/H) < n. By the induction
hypothesis, Z(G)/H has some element, say bH, b ∉ H, of order p. This means that
(bH)^p = H and hence b^p ∈ H. However, since o(b) and p are relatively prime, it
follows that there exist integers A and B such that Ao(b) + Bp = 1, and so

b = b^1 = b^{Ao(b) + Bp} = (b^{o(b)})^A (b^p)^B ∈ H,

which is a contradiction. Thus G must contain an element of order p. □


We now have sufficient information to classify all the groups of order 2p where p is 
prime. 
Corollary 11.10 If p > 2 is a prime and G is a group of order 2p, then G is isomorphic
to either (Z_{2p}, +) or D_p.

Proof. By Theorem 11.9, G has elements a and b of orders p and 2, respectively. Since
H = ⟨a⟩ has half the elements of G, it follows from Proposition 10.3 that H is a normal
subgroup of G. Consequently, bab^{-1} ∈ bHb^{-1} = H = ⟨a⟩, and hence bab^{-1} = a^k for
some k ∈ Z_p. However,

ba^k b^{-1} = (bab^{-1})(bab^{-1}) ··· (bab^{-1}) = a^k a^k ··· a^k = a^{k^2},

and so

a = b^2 a b^{-2} = b(bab^{-1})b^{-1} = ba^k b^{-1} = a^{k^2}.

It follows that k^2 ≡ 1 (mod p) or k ≡ ±1 (mod p).
If k ≡ 1 (mod p), then bab^{-1} = a or ba = ab. Since o(a) = p and o(b) = 2, it
follows from Proposition 9.7 that o(ab) = 2p so that G ≅ (Z_{2p}, +).

Otherwise, k ≡ -1 (mod p) so that bab^{-1} = a^{-1}, or ba = a^{-1}b = a^{p-1}b. In this
case G is necessarily isomorphic to D_p. To see this observe that the 2p elements of
G can be listed as a^0 b^0, ..., a^{p-1} b^0, a^0 b^1, ..., a^{p-1} b^1. On the other hand, in D_p let
α = (1 2 ... p) and let β = (2 p)(3 p-1) ··· ((p+1)/2 (p+3)/2). Then βαβ^{-1} = α^{-1},
or βα = α^{-1}β = α^{p-1}β. It is now easily verified that the function f(a^i b^j) = α^i β^j defines
an isomorphism of G and D_p (Exercise 11.2.12). □


Exercises 11.2 


1. Let G be a commutative group of order p_1 p_2 ··· p_k where p_1, p_2, ..., p_k are distinct
primes. Prove that G ≅ (Z_{p_1}, +) × (Z_{p_2}, +) × ··· × (Z_{p_k}, +).

2. Prove that a finite group has exactly one conjugacy class if and only if it is trivial. 

3. Prove that a finite group has exactly two conjugacy classes if and only if it has order
2.

4. Prove that a finite group has exactly three conjugacy classes if and only if it has
order 3.

5. Let p and q be distinct primes. Prove that every noncommutative group of order
pq has a trivial center.

6. Let k and n be arbitrary positive integers and let p be a prime such that (k, p) = 1
and n/2 < p < n. Let f be a function of n variables that has k distinct variants.
Prove that there is a p-cycle σ in S_n such that σf = f.

7. Suppose G is a commutative group of order p_1^{e_1} p_2^{e_2} ··· p_k^{e_k} where p_1, p_2, ..., p_k
are distinct primes. Prove that G has subgroups G_1, G_2, ..., G_k that have orders of
p_1^{e_1}, p_2^{e_2}, ..., p_k^{e_k}, respectively, such that G ≅ G_1 × G_2 × ··· × G_k.


8. Classify the following groups according to isomorphism type: 


(a) (Ze, +) (e) V1 
(b) ((123456)) (f) Ss where f = x, xx, 
() ((123)(45)) (g) (Z;,-) 


(d) S, 




9. Classify the following groups according to isomorphism type: 
(a) (Zg, +) f) (GF(2, x? + x* +1), 4) 
(b) D, . (GF*(3, x* + x + 2),) 
(c) the Quaternion group (h) (Z,,+) x (Z,, +) 
(dd) V1 (i) (Z,,+) 
(ec) ((12345678)) 
10. Classify the following groups according to isomorphism type: 
(a) (Zy, +) (c) (GF(3, x? +x +2), +) 
(b) V1 (d) ((123456789)) 
(e) S, 0.f where f = x? Hy + XS HX, +0 +9 x; 
(f) (Z5,.+)° 
11. Classify the following groups according to isomorphism type:
(a) (Zy9s+) (b) D, («) V1 
(d) Ss where f = x, x, + XX + 3X4 + x4%5 + x5, 
(e) (Zj,,°) (f) ((12345)(67)) 
12. Prove that the function f defined in the proof of Corollary 11.10 is indeed an 
isomorphism. 
13. Find a group of order 16 that is isomorphic to neither (Z_16, +) nor (Z_2, +)^4.
Chapter Summary 


The direct product of groups was introduced as a means for constructing a host of new
groups. Cauchy’s partial converse to Lagrange’s Theorem and the class equation proved
to be useful tools in investigating the isomorphism types of groups. It was shown that
for any prime number p there are only two nonisomorphic groups of order p^2 and only
two nonisomorphic groups of order 2p. The consecutive solutions of certain landmark
quadratic Diophantine equations lead mathematicians to question the traditional view
of prime numbers as integers that cannot be factored into smaller numbers.

Chapter Review Exercises 


Mark the following true or false. 
1. The direct product of (Z,,+) with S, has 32 elements. 
a> GReileyec. 


3. If G is a group, then the conjugacy class C(a) of any element a of G is a subgroup
of G.

4. If G is a group and a ∈ G, then Z_a is a commutative group.

5. If G is a commutative group, then Z_a = G for all a ∈ G.

6. Every group of order 49 is commutative.

7. If H is a normal subgroup of index 100 in G, then G/H contains an element of
order 25.

8. If p is a prime number, then every group of order 2p is commutative.


New Terms 

center, 266 conjugate, 265 
centralizer, 265 decomposable group, 264 
class equation, 267 direct product, 261 


conjugacy class, 265 


Supplementary Exercises 


1. Prove that every finite commutative group is the direct sum of cyclic groups. 


2. Characterize all the finite groups that have exactly » conjugacy classes for as many 


positive integers 7 as possible. 


Chapter 12 


NUMBER THEORY 


We now embark on a different journey, in pursuit of the ubiquitous rings and
ideals. As was the case before (in Chapter 1) the journey began in ancient
Egypt and Mesopotamia where someone discovered that some numbers are qualitatively
different from others. The road then took us on to Greece, France, Switzerland, and
Germany.


12.1 Pythagorean Triples 


The Diophantine-Pythagorean equation x^2 + y^2 = z^2 generalizes in many ways as well as
inspires many fruitful questions and very clever answers. The quadratic residue theorem
is encountered along the way, and we explain why Gauss considered it to be the jewel in
the crown of number theory. The proof given here makes strong use of the geometrical
property of plane lattices.

The Theorem of Pythagoras, which states that in a right triangle with hypotenuse of
length z, and shorter sides of lengths x and y, the equation

x^2 + y^2 = z^2     (12.1)

holds, is considered to be one of the most important theorems of mathematics. Examples
of such solution triples are {3, 4, 5}, {5, 12, 13}, and {12,709, 13,500, 18,541}. The
fact that the third of these was one of 15 such triples listed in the Babylonian tablet
Plimpton 322, which dates between 1900 and 1650 BCE, attests to the fascination with
which these triples were regarded, even as far back as 4,000 years ago. It was natural for
the more mathematically minded scribes of the time to try to generate a list of all such
triples. Euclid included part of this problem as Lemma I to Proposition 29 of Book X of
The Elements:



Proposition 12.2 If u > v are two positive integers, then x = 2uv, y = u^2 - v^2, and
z = u^2 + v^2 satisfy x^2 + y^2 = z^2.

Proof. See Exercise 12.1.6. □


The reason this solution was qualified above as only “partial” is that it failed to prove
that all the desired triples can be so obtained. The statement and proof of this “converse”
require some care. A Pythagorean triple is a list of positive integers (x, y, z), which
satisfy Equation 12.1 with the proviso that (x, y, z) and (y, x, z) are considered to be
the same Pythagorean triple. A Pythagorean triangle is a triangle the lengths of whose
sides constitute a Pythagorean triple. It is easy to see that the common factor of any two
members of a Pythagorean triple must also divide the third one and hence the condition
that x, y, and z share no common factor but 1 is tantamount to saying that every two
of them are relatively prime. Such triples are said to be primitive. It is clear that if g
is the greatest common divisor of the components x, y, z of a Pythagorean triple, then
(x/g, y/g, z/g) constitutes a primitive triple.


Lemma 12.3 If (x, y, z) is a primitive Pythagorean triple, then z is odd and exactly
one of x and y is odd.

Proof. If x and y are even, it follows that z is also even, thus contradicting the primitivity
of the triple. Nor can both be odd, since otherwise there exist positive integers a, b,
and c such that

(2a + 1)^2 + (2b + 1)^2 = (2c)^2

or 1 + 1 ≡ 0 (mod 4), which is impossible. □


Lemma 12.4 Let a, b, and c be positive integers such that a^2 = bc and (b, c) = 1.
Then there exist integers u and v such that b = u^2 and c = v^2.

Proof. Let

a = p_1^{r_1} p_2^{r_2} ··· p_m^{r_m}

be the prime factorization of a. Then

a^2 = p_1^{2r_1} p_2^{2r_2} ··· p_m^{2r_m}.

Since (b, c) = 1, it follows that after a suitable permutation of the subscripts

b = p_1^{2r_1} p_2^{2r_2} ··· p_k^{2r_k} and c = p_{k+1}^{2r_{k+1}} p_{k+2}^{2r_{k+2}} ··· p_m^{2r_m}

where k is some integer between 1 and m. Hence we can set

u = p_1^{r_1} p_2^{r_2} ··· p_k^{r_k} and v = p_{k+1}^{r_{k+1}} p_{k+2}^{r_{k+2}} ··· p_m^{r_m}. □


Theorem 12.5 A Pythagorean triple (x, y, z), in which x is even, is primitive if and
only if it has the form

x = 2st, y = s^2 - t^2, and z = s^2 + t^2     (12.6)

where s and t are positive integers of opposite parity, s > t, and (s, t) = 1. Every
Pythagorean triple is an integer multiple of some primitive triple.


Proof. Let (x, y, z) be a primitive triple in which x is even. It follows that both y and z
are odd so that u = (z - y)/2 and v = (z + y)/2 are positive integers such that

(x/2)^2 = x^2/4 = (z^2 - y^2)/4 = ((z - y)/2) · ((z + y)/2) = uv.

Note that any common divisor of u and v is necessarily also a common divisor of y
and z (because y = v - u and z = u + v). Since (y, z) = 1, it follows that (u, v) = 1.
Now, the product of u and v is a perfect square (x/2)^2. Since u and v are relatively
prime, they must each be perfect squares too (Lemma 12.4), say u = t^2 and v = s^2. Thus

(x/2)^2 = uv or x^2 = 4s^2 t^2 or x = 2st     (12.7)

and

y = v - u = s^2 - t^2 and z = u + v = s^2 + t^2.     (12.8)

The integers s and t must be of opposite parity since otherwise y and z would both
be divisible by 2, contradicting the primitivity of the triple (x, y, z). Finally u < v and
(u, v) = 1 imply that s > t and (s, t) = 1.

Conversely, let s and t be positive integers of opposite parity such that s > t and
(s, t) = 1. It is easily verified that if x, y, and z are defined by means of Equations 12.7
and 12.8, then they form a Pythagorean triple with an even x. It remains to show that
this triple is primitive. If p is any odd prime that is a common factor of both y and z,
then p is also a common factor of the relatively prime

s^2 = (z + y)/2 and t^2 = (z - y)/2,

contradicting the fact that (s, t) = 1. As for the case p = 2, since s and t have different
parities y and z are both odd, and so 2 is not a common factor of y and z either. Thus,
y and z have no common prime factors, i.e., (y, z) = 1. □


For example, if s = 2 and t = 1, then x = 4, y = 3, and z = 5. On the other hand,
s = 125 and t = 54 yield x = 13,500, y = 12,709, and z = 18,541.
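The parametrization of Theorem 12.5 translates directly into a short program. The following sketch is an illustration only, with hypothetical function names: it builds the triple of Equation 12.6 from admissible parameters s and t, and lists all primitive triples whose hypotenuse does not exceed a given bound.

```python
from math import gcd

def primitive_triple(s, t):
    """Return (2st, s^2 - t^2, s^2 + t^2) for admissible parameters s > t > 0."""
    if not (s > t > 0):
        raise ValueError("need s > t > 0")
    if gcd(s, t) != 1 or (s - t) % 2 == 0:
        raise ValueError("s and t must be relatively prime and of opposite parity")
    return (2 * s * t, s * s - t * t, s * s + t * t)

def primitive_triples_up_to(z_max):
    """All primitive triples (x, y, z) with x even and z <= z_max."""
    triples = []
    s = 2
    while s * s + 1 <= z_max:  # the smallest hypotenuse for a given s is s^2 + 1
        for t in range(1, s):
            if gcd(s, t) == 1 and (s - t) % 2 == 1 and s * s + t * t <= z_max:
                triples.append(primitive_triple(s, t))
        s += 1
    return triples
```

With s = 125 and t = 54 the function reproduces the Plimpton 322 triple quoted above.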

To find all Pythagorean triangles with a side of length 18, begin by finding all primitive
Pythagorean triples one of whose legs is a divisor of 18. Because the parameters s and t
in Theorem 12.5 are distinct, it follows that the length of every leg of any right triangle is
at least

s^2 - t^2 ≥ 4 - 1 = 3

and equals 3 if and only if s = 2 and t = 1. Hence no right triangles have legs of length 1 or
2, and 3 occurs only in the Pythagorean triple (3, 4, 5). This yields the triple (18, 24, 30).
Because 6 is even, it must be the side 2st where s and t have opposite parities. This, by
inspection, is impossible. A similar argument eliminates 18. This leaves only 9, which
must equal s^2 - t^2, since it clearly cannot equal either 2st or s^2 + t^2. The factorization

9 = s^2 - t^2 = (s + t)(s - t)

has only the solution s = 5 and t = 4. This yields the triangle 2(9, 40, 41) = (18, 80, 82).
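The outcome of this search can be double-checked by exhaustive search rather than by the parametrization. The sketch below is a brute-force illustration with hypothetical function names; it relies only on the observation that if n^2 + y^2 = z^2 then z ≥ y + 1, which forces y ≤ (n^2 - 1)/2.

```python
from math import isqrt

def triangles_with_leg(n):
    """All triples (n, y, z) of positive integers with n^2 + y^2 = z^2."""
    found = []
    for y in range(1, n * n // 2 + 1):  # y cannot exceed (n^2 - 1)/2
        z_squared = n * n + y * y
        z = isqrt(z_squared)
        if z * z == z_squared:
            found.append((n, y, z))
    return found

def triangles_with_hypotenuse(n):
    """All triples (x, y, n) of positive integers with x <= y and x^2 + y^2 = n^2."""
    found = []
    for x in range(1, n):
        y_squared = n * n - x * x
        y = isqrt(y_squared)
        if y >= x and y * y == y_squared:
            found.append((x, y, n))
    return found
```

For n = 18 the first function returns the two triangles found above, and the second returns an empty list, so 18 never occurs as a hypotenuse.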
In order to find all the Pythagorean triangles (x, y, z) such that 40 ≤ z ≤ 50, some
tedious work leads to the conclusion that the only positive integers z which are the sums
of two squares s^2 + t^2 < 50 where (s, t) = 1 and s > t are

 5 = 2^2 + 1^2,     10 = 3^2 + 1^2,
13 = 3^2 + 2^2,     17 = 4^2 + 1^2,
25 = 4^2 + 3^2,     26 = 5^2 + 1^2,
29 = 5^2 + 2^2,     34 = 5^2 + 3^2,
37 = 6^2 + 1^2,     41 = 5^2 + 4^2.

Of these, the only primitive triple in the range 40 ≤ z ≤ 50 is (9, 40, 41). To find
nonprimitive solutions we locate for each integer d all triples (x, y, z) such that 40 ≤
dz ≤ 50.



For d = 2 the range of z becomes 20 ≤ z ≤ 25 and for this range only 25 is the sum
of two squares, 25 = 4^2 + 3^2. This gives the triple 2(7, 24, 25) = (14, 48, 50).

For d = 3 the range becomes 14 ≤ z ≤ 16, none of which numbers is the sum of two
squares.

For d = 4 the range becomes 10 ≤ z ≤ 12 and for this range only 10 is the sum of
two squares, 10 = 3^2 + 1^2. This gives the triple 4(6, 8, 10) = (24, 32, 40).

For d = 5 the range becomes 8 ≤ z ≤ 10 and for this range only 10 is the sum of two
squares with relatively prime sides. This gives the triple 5(6, 8, 10) = (30, 40, 50).

For d = 6 the range becomes 7 ≤ z ≤ 8, which yields no triples.

For d = 7 the range becomes 6 ≤ z ≤ 7, which yields no triples.

For d = 8 the range becomes 5 ≤ z ≤ 6, and for this range only 5 is the sum of two
squares. This gives the triple 8(3, 4, 5) = (24, 32, 40).

For d = 9 the range narrows down to just 5 = 2^2 + 1^2, which yields the triple 9(3, 4, 5) =
(27, 36, 45).

For d = 10 the range becomes 4 ≤ z ≤ 5 and for this range only 5 = 2^2 + 1^2 is the
sum of two squares. This yields the triple 10(3, 4, 5) = (30, 40, 50).

The multiplier d cannot be greater than 10 because otherwise both the endpoints of
the range are less than 5, which is the smallest positive number that can be expressed as
the sum of two relatively prime and distinct squares. Hence the answer consists of the
following list:

(30, 40, 50), (14, 48, 50), (27, 36, 45), (9, 40, 41), (24, 32, 40).
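Since the range is small, the list can be confirmed by trying every candidate hypotenuse. The sketch below is a brute-force check, not the method of the text, and the function name is hypothetical.

```python
from math import isqrt

def triples_with_hypotenuse_between(lo, hi):
    """All Pythagorean triples (x, y, z) with x <= y and lo <= z <= hi."""
    result = []
    for z in range(lo, hi + 1):
        for x in range(1, z):
            y_squared = z * z - x * x
            y = isqrt(y_squared)
            if y * y == y_squared and x <= y:
                result.append((x, y, z))
    return result

found = triples_with_hypotenuse_between(40, 50)
```

The search returns exactly the five triples listed above, ordered by hypotenuse.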


In conclusion, we note that Fermat took it for granted that Proposition 12.2 did
completely account for all the Pythagorean triples. Theorem 12.5 was formulated and
proved by Euler about 100 years later.


Exercises 12.1 


1. Let (x, y, z) be a Pythagorean triple. Prove the following statements:
(a) At least one of x and y is divisible by 3;
(b) At least one of x, y and z is divisible by 5.



2. Find all Pythagorean triples (x, y, z) with 

(a) 40 < min{ x, y} < 45 

(b) 45 < max{ x, y} < 50 

(c) 50<z<60. 
3. Find all the Pythagorean triangles having one side of length 481.
4. Find all Pythagorean triples (x, y,z) with x < y and z=21. 


5. Determine the right triangles with integral sides whose areas equal their perimeter. 


(Hint: Work in terms of s and t.) 


6. Prove Proposition 12.2. 


12.2 Sums of Two Squares 


In his book Arithmetica, written in the third century CE, Diophantus included a variety
of propositions regarding sums of squares, including Problem 8 of Book 2 which asks
for the division of a square into two other squares. This problem is clearly inspired by
the notion of Pythagorean triples. Fermat, coming upon this problem while reading the
Arithmetica, wrote in the book’s margin a note to the effect that a cube cannot be split into
two cubes nor a fourth power into two fourth powers. He claimed to have a remarkable
proof of the following statement:

For any integer $n > 2$ there exist no positive integers $x$, $y$, $z$ such that $x^n + y^n = z^n$,

the proof of which, alas, was too long for the margin. Fermat's proof, if indeed he had one,
accompanied him to the grave. Some 350 years later a proof, hundreds of pages long, which
made use of state-of-the-art mathematical tools and also relied on much mathematics
developed in the intervening centuries, was given by Andrew Wiles and Richard Taylor.

One approach to obtaining Pythagorean triples could be to ask which positive integers
are expressible as the sum of two squares. This problem was formulated by Fermat and
solved, 100 years later, by Euler. We note in passing that every square is trivially also the
sum of two squares, namely, itself and the square of “side” 0. The following proposition
is, of course, very helpful in this context.


Proposition 12.9 (Brahmagupta 598-660) If each of two positive integers is the sum
of two squares, then so is their product.




prime   sum of squares      prime   sum of squares

  2     $1^2 + 1^2$           43    -
  3     -                     47    -
  5     $1^2 + 2^2$           53    $2^2 + 7^2$
  7     -                     59    -
 11     -                     61    $5^2 + 6^2$
 13     $2^2 + 3^2$           67    -
 17     $1^2 + 4^2$           71    -
 19     -                     73    $3^2 + 8^2$
 23     -                     79    -
 29     $2^2 + 5^2$           83    -
 31     -                     89    $5^2 + 8^2$
 37     $1^2 + 6^2$           97    $4^2 + 9^2$
 41     $4^2 + 5^2$

Table 12.1 Sums of squares
Proof. Suppose $m = a^2 + b^2$ and $n = c^2 + d^2$. Then

$$mn = (a^2 + b^2)(c^2 + d^2) = a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2$$
$$= (ac)^2 + (bd)^2 - 2(ac)(bd) + (ad)^2 + (bc)^2 + 2(ad)(bc)$$
$$= (ac - bd)^2 + (ad + bc)^2. \qquad ∎$$
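The identity in the proof above is easy to test numerically. The following is a minimal Python sketch; the function name `compose_two_squares` is an illustrative choice, not notation from the text.

```python
def compose_two_squares(a, b, c, d):
    """Return (u, v) with u^2 + v^2 == (a^2 + b^2) * (c^2 + d^2).

    Uses the identity (ac - bd)^2 + (ad + bc)^2 from the proof of Proposition 12.9.
    """
    return (a * c - b * d, a * d + b * c)


# 13 = 2^2 + 3^2 and 17 = 1^2 + 4^2, so 13 * 17 = 221 should be a sum of two squares.
u, v = compose_two_squares(2, 3, 1, 4)
print(u, v, u * u + v * v)  # -10 11 221
```

A negative component is harmless since only its square matters.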
In view of Proposition 12.9 and the Fundamental Theorem of Arithmetic (Theorem 4.9),
it should suffice to answer a restricted question: Which prime numbers are
expressible as the sum of two squares? Some experimentation (see Table 12.1) would lead
any interested student to correctly conjecture the following theorem.


Theorem 12.10 An odd prime $p$ can be expressed as the sum of two squares if and
only if there is a positive integer $k$ such that $p = 4k + 1$.


The proof is somewhat intricate and is broken up into several lemmas. 
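Before turning to the lemmas, the statement of Theorem 12.10 can be checked by brute force for small primes. The sketch below is only an experiment in the spirit of Table 12.1, not part of the proof; the helper names `is_prime` and `two_squares` are assumptions made for this illustration.

```python
from math import isqrt


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


def two_squares(n):
    """Return (a, b) with 0 <= a <= b and a^2 + b^2 == n, or None if no such pair exists."""
    for a in range(isqrt(n) + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b >= a and b * b == rest:
            return (a, b)
    return None


odd_primes = [p for p in range(3, 200) if is_prime(p)]
# For every odd prime below 200, representability coincides with p = 4k + 1.
print(all((two_squares(p) is not None) == (p % 4 == 1) for p in odd_primes))  # True
print(two_squares(13), two_squares(97))  # (2, 3) (4, 9)
```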




Lemma 12.11 An odd prime of the form $4k + 3$ is not expressible as the sum of two
squares.

Proof. Let $p = 4k + 3$ and suppose, by way of contradiction, that $p$ is the sum of the
two squares $x^2$ and $y^2$. Since these two squares add up to the odd number $p = 4k + 3$,
it follows that one of them, say $x$, is even and the other, that is, $y$, is odd. Hence there
exist integers $a$ and $b$ such that

$$4k + 3 = (2a)^2 + (2b + 1)^2 = 4(a^2 + b^2 + b) + 1 \equiv 1 \pmod 4,$$

which is impossible since $4k + 3 \equiv 3 \pmod 4$. ∎


Lemma 12.12 If $(a, m) = 1$, then the equivalence $ax \equiv b \pmod m$ has a unique
solution for each value of $b$.

Proof. This follows immediately from Corollary 4.4. ∎


Fermat’s Theorem (Theorem 5.15) states that if $p$ is a prime and $a \not\equiv 0 \pmod p$, then

$$a^{p-1} \equiv 1 \pmod p.$$

It follows that, when $p$ is odd,

$$\left[a^{(p-1)/2}\right]^2 = a^{p-1} \equiv 1 \pmod p.$$

In other words $a^{(p-1)/2}$ is a solution of the equation $x^2 \equiv 1 \pmod p$. Since
$\mathbb{Z}_p$ is a field, this equation has only $\pm 1$ as its solutions, and it follows that
$a^{(p-1)/2} \equiv \pm 1 \pmod p$.


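Theorem 12.13 (Wilson’s Theorem) If $p$ is a prime, then $(p - 1)! \equiv -1 \pmod p$.

Proof. Because $\mathbb{Z}_p$ is a field, 1 and $p - 1$ are the only solutions of the
equivalence $x^2 \equiv 1 \pmod p$. In other words, 1 and $-1$ are the only elements of
$\mathbb{Z}_p$ that are their own inverses. Consequently, pairing each of the remaining
elements with its inverse,

$$(p-1)! = 1\cdot(p-1)\cdot(2\cdot 2^{-1})\cdot(3\cdot 3^{-1})\cdots\left(\tfrac{p-1}{2}\cdot\left[\tfrac{p-1}{2}\right]^{-1}\right) \equiv -1 \pmod p. \qquad ∎$$

For example, $(3-1)! = 2 \equiv -1 \pmod 3$, $(5-1)! = 24 \equiv -1 \pmod 5$, and
$(7-1)! = 720 \equiv -1 \pmod 7$.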
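The numerical examples above can be extended with a few lines of Python. This is only a quick check under the obvious assumption that the factorials involved are small enough to compute directly; `wilson_residue` is a name invented for the illustration.

```python
from math import factorial


def wilson_residue(n):
    """Return (n - 1)! reduced modulo n."""
    return factorial(n - 1) % n


for p in (3, 5, 7, 11, 13):
    # For a prime p the residue is p - 1, that is, -1 modulo p.
    print(p, wilson_residue(p), wilson_residue(p) == p - 1)

# A composite modulus behaves differently: 8! is divisible by 9.
print(wilson_residue(9))  # 0
```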

Modular arithmetic, especially over a field with a prime number of elements, has
analogs of many notions of real arithmetic. It would be nice to have a notion of positive
and negative, and indeed a fruitful definition of modular “positive” numbers can be
given. Such is the concept of quadratic residue: The integer $a \ne 0$ is said to be a
quadratic residue modulo $p$ if there is a solution in $\mathbb{Z}_p$ to the equation

$$x^2 \equiv a \pmod p.$$

If no such solution exists, then $a$ is a quadratic nonresidue.


Proposition 12.14 (Euler's Criterion) Let $a$ be a positive integer and $p$ be a prime such
that $p \nmid a$. Then $a$ is a quadratic residue modulo $p$ if and only if

$$a^{(p-1)/2} \equiv 1 \pmod p. \qquad (12.15)$$


Proof. The easy case where $p = 2$ is relegated to Exercise 12.2.1. Hence we may assume
that $p$ is odd. The proof adds a slight twist to the proof of Wilson’s theorem: instead of
pairing elements of $\mathbb{Z}_p$ whose product is 1 we pair elements whose product is $a$.
If $a$ is a quadratic nonresidue, then the equivalence

$$m^2 \equiv a \pmod p \qquad (12.16)$$

has no solutions, so that

$$-1 \equiv (p-1)! \equiv (1\cdot a)\cdot(2\cdot a2^{-1})\cdots\left(\tfrac{p-1}{2}\cdot a\left[\tfrac{p-1}{2}\right]^{-1}\right) = a^{(p-1)/2}. \qquad (12.17)$$

On the other hand, if $a$ is a quadratic residue modulo $p$, then there exists an element
$m \in \mathbb{Z}_p$ such that both $m$ and $-m \equiv p - m \pmod p$ are distinct solutions of
Equation 12.16. These roots of Equation 12.16 contribute a singleton each to the product in
Equation 12.17 and hence in this case the product of Equation 12.17 simplifies to

$$-1 \equiv \cdots \equiv a^{(p-1-2)/2}\cdot m\cdot(p - m),$$

or

$$-1 \equiv a^{(p-3)/2}(-a) = -a^{(p-1)/2} \pmod p,$$

and the desired equivalence (Equation 12.15) follows immediately. ∎


For example, let $p = 7$. Then Equation 12.16 has a solution for $a = 1, 2, 4$ and does
not for $a = 3, 5, 6$. At the same time, $(p - 1)! = 6! \equiv -1 \pmod 7$, $(7 - 1)/2 = 3$, and,
modulo 7, $1^3 \equiv 2^3 \equiv 4^3 \equiv 1$, whereas $3^3 \equiv 5^3 \equiv 6^3 \equiv -1$.
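The example with $p = 7$ can be reproduced, and repeated for other small primes, by comparing the squares modulo $p$ with the values of $a$ that pass Euler's criterion. A minimal Python sketch follows; the function names are illustrative assumptions.

```python
def quadratic_residues(p):
    """Nonzero squares modulo the prime p, in increasing order."""
    return sorted({(x * x) % p for x in range(1, p)})


def passes_euler_criterion(a, p):
    """True when a^((p-1)/2) is congruent to 1 modulo the odd prime p."""
    return pow(a, (p - 1) // 2, p) == 1


p = 7
print(quadratic_residues(p))                                      # [1, 2, 4]
print([a for a in range(1, p) if passes_euler_criterion(a, p)])   # [1, 2, 4]
```

The built-in three-argument `pow` performs modular exponentiation, so the check stays fast even for large primes.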




Those values of $a$ for which Equation 12.16 has a solution are called the quadratic
residues of $p$. It is clear from the definition that

$$1^2, 2^2, 3^2, \ldots, \left(\tfrac{p-1}{2}\right)^2 \pmod p$$

are all quadratic residues of $p$.
It follows from the above propositions that if $p$ is an odd prime, then the following three
statements are equivalent:
* $a$ is a quadratic residue mod $p$;
* $x^2 \equiv a$ has a solution mod $p$;
* $a^{(p-1)/2} \equiv 1 \pmod p$.


We add another equivalent statement: 


Lemma 12.18 Let $p$ be an odd prime. Then the congruence

$$x^2 + 1 \equiv 0 \pmod p \qquad (12.19)$$

has a solution in $\mathbb{Z}_p$ if and only if

$$p \equiv 1 \pmod 4. \qquad (12.20)$$

Proof. Suppose Congruence 12.19 has a solution in $\mathbb{Z}_p$. Then so, clearly, does

$$x^2 \equiv -1 \pmod p,$$

which means that $-1$ is a quadratic residue modulo $p$. By Euler’s criterion,

$$(-1)^{(p-1)/2} \equiv 1 \pmod p$$

and hence the exponent must be even. Thus, there is an integer $k$ such that $(p - 1)/2 = 2k$
or $p = 4k + 1$, which is tantamount to Congruence 12.20. The proof of the converse is
similar and is relegated to Exercise 12.2.5. ∎


Lemma 12.21 Let $p = 4k + 3$ be a prime and let $np$ be a sum of two squares, say
$np = a^2 + b^2$ for some $a, b \in \mathbb{Z}$. Then
* $p$ divides both $a$ and $b$;
* $p$ appears in the prime factorization of $np$ with an even exponent.




Proof. The first statement is clear if $p$ divides $a$, since then $p$ also divides
$b^2 = np - a^2$. If $p$ does not divide $a$, then $a$ has a multiplicative inverse $a^{-1}$
modulo $p$ and it also follows from the hypothesis that

$$a^2 + b^2 \equiv 0 \pmod p.$$

Division by $a^2$ gives $(a^{-1}b)^2 + 1 \equiv 0 \pmod p$. In other words, $x = a^{-1}b$ is a solution
of Equation 12.19 and hence $p \equiv 1 \pmod 4$. This, however, contradicts the hypothesis
$p = 4k + 3$.

For the second statement, suppose that $p$ does not appear in the prime factorization
of $np$ with an even exponent. Fix $p$ and let $n$ be the least positive integer such that the
exponent of $p$ in the prime factorization of $np = a^2 + b^2$ is odd. By the first part of this
proof, $p$ divides both $a$ and $b$, so that $p^2$ divides $a^2$, $b^2$, and $np$. Since the exponent of
$p$ in $np$ is odd, it is at least 3, and so $p^2$ also divides $n$. Hence the terms of the equation

$$\frac{n}{p^2}\,p = \left(\frac{a}{p}\right)^2 + \left(\frac{b}{p}\right)^2$$

are in fact integers rather than just rational numbers and so they constitute a smaller
counterexample than the previous one. This contradicts the minimality of the first
counterexample. This contradiction completes the proof of the lemma. ∎


The proof of the second part of the above lemma is a variant of mathematical induction
called the method of infinite descent. It was formulated by Fermat in the context of this
very topic.

If $x$ is any real number, then the floor function $\lfloor x \rfloor$ denotes the unique integer
such that $\lfloor x \rfloor \le x < \lfloor x \rfloor + 1$. Thus, $\lfloor 2.5 \rfloor = 2$, $\lfloor -2.5 \rfloor = -3$, and $\lfloor -0.5 \rfloor = -1$.


Theorem 12.22 Every prime of the form $4k + 1$ can be expressed as the sum of two
relatively prime squares.


Proof. Let $p = 4k + 1$ be a prime. It follows from Euler’s criterion (Proposition 12.14)
that there is a number $s$ such that $s^2 \equiv -1 \pmod p$. Consider the function of the integers
$x$ and $y$ defined by $f(x, y) = sx - y$, reduced modulo $p$, for $0 \le x, y < \sqrt{p}$. Since there
are $\lfloor\sqrt{p}\rfloor + 1$ choices for each of $x$ and $y$, the number of points in the domain of $f$ is

$$(\lfloor\sqrt{p}\rfloor + 1)^2 > (\sqrt{p})^2 = p,$$

which is greater than the number of integers in the codomain of $f$. Consequently
there are two distinct points $(x_1, y_1)$ and $(x_2, y_2)$ such that $f(x_1, y_1) = f(x_2, y_2)$, or
$sx_1 - y_1 \equiv sx_2 - y_2 \pmod p$, and so

$$sx \equiv y \pmod p \qquad (12.23)$$

where $x = x_1 - x_2$ and $y = y_1 - y_2$. Note that if $x = 0$, then $y \equiv 0 \pmod p$ and so, since
$|y| < \sqrt{p}$, $y = 0$; conversely, if $y = 0$, then, since $s \not\equiv 0$, it follows that $x = 0$. Because
the two points are distinct, neither $x$ nor $y$ is zero. Squaring
Congruence 12.23 yields $y^2 \equiv s^2x^2 \equiv -x^2 \pmod p$, or $x^2 + y^2 \equiv 0 \pmod p$. It follows
that $p \mid (x^2 + y^2)$. However,

$$0 < x^2 + y^2 < 2(\sqrt{p})^2 = 2p,$$

and so we conclude that

$$x^2 + y^2 = p. \qquad (12.24)$$

The reason $x$ and $y$ are relatively prime is that every prime factor of both must also
divide $p$ and must therefore equal $p$. This, however, is impossible since it entails

$$x^2 + y^2 \ge p^2 + p^2 = 2p^2 > p,$$

thus contradicting Equation 12.24. ∎
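The pigeonhole argument in this proof is effectively an algorithm: tabulate $sx - y$ modulo $p$ over the small box of points and stop at the first repeated value. The Python sketch below follows that idea under stated assumptions: $s$ is found by naive search rather than by any clever method, and the function names are invented for the illustration.

```python
from math import gcd, isqrt


def sqrt_of_minus_one(p):
    """Smallest s in 1..p-1 with s^2 congruent to -1 modulo p (requires p = 4k + 1 prime)."""
    return next(s for s in range(1, p) if (s * s + 1) % p == 0)


def two_squares_via_pigeonhole(p):
    """Return (x, y) with x^2 + y^2 == p, following the collision argument of the proof."""
    s = sqrt_of_minus_one(p)
    bound = isqrt(p)                      # integers x, y with 0 <= x, y < sqrt(p)
    seen = {}                             # value of (s*x - y) mod p  ->  point (x, y)
    for x in range(bound + 1):
        for y in range(bound + 1):
            key = (s * x - y) % p
            if key in seen:               # two distinct points with the same value
                x0, y0 = seen[key]
                return tuple(sorted((abs(x - x0), abs(y - y0))))
            seen[key] = (x, y)
    raise AssertionError("unreachable: the box holds more than p points")


print(two_squares_via_pigeonhole(13))  # (2, 3)
```

The search for $s$ takes time proportional to $p$, so this is a demonstration of the proof rather than an efficient method.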


Proposition 12.25 The positive integer $n$ is a sum of two squares if and only if every
prime divisor of $n$ of form $4k + 3$ appears in the unique factorization of $n$ with an even
exponent.


Proof. Suppose

$$n = 2^{t}\, p_1^{r_1} p_2^{r_2} \cdots p_d^{r_d}\, q_1^{s_1} q_2^{s_2} \cdots q_e^{s_e}$$

is expressible as the sum of two squares, where $p_i \equiv 1 \pmod 4$ for $i = 1, 2, \ldots, d$, and
$q_j \equiv 3 \pmod 4$ for $j = 1, 2, \ldots, e$. By Lemma 12.21 each of the $s_j$'s is even.
Conversely, suppose each of the $s_j$'s is even. If $s_j = 2$, then we have the trivial expression
$q_j^2 = q_j^2 + 0^2$. The application of Proposition 12.9 and Theorem 12.22, together with
$2 = 1^2 + 1^2$, shows that the number $n$ defined above is also presentable as the sum of
two squares. ∎
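Proposition 12.25 turns the question into a test on the prime factorization. As a sanity check, the sketch below compares that test with a direct search; it assumes trial-division factoring, which is adequate only for small $n$, and all function names are illustrative.

```python
from math import isqrt


def prime_factorization(n):
    """Return {prime: exponent} for n >= 1 by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_sum_of_two_squares(n):
    """Criterion of Proposition 12.25: primes of form 4k + 3 occur to even powers."""
    return all(e % 2 == 0 for q, e in prime_factorization(n).items() if q % 4 == 3)


def is_sum_of_two_squares_brute(n):
    """Direct search for a with n - a^2 a perfect square."""
    return any(isqrt(n - a * a) ** 2 == n - a * a for a in range(isqrt(n) + 1))


print(is_sum_of_two_squares(45), is_sum_of_two_squares(21))  # True False
print(all(is_sum_of_two_squares(n) == is_sum_of_two_squares_brute(n) for n in range(1, 2000)))  # True
```

Here $45 = 3^2 \cdot 5 = 6^2 + 3^2$, whereas $21 = 3 \cdot 7$ contains primes of the form $4k + 3$ to an odd power.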


Exercises 12.2 


1. Prove that Proposition 12.14 holds when p = 2. 




$p = 4k + 1$                      $p = 4k + 3$
$p = \square + \square$           $p \ne \square + \square$
$-1 \equiv a^2 \pmod p$           $-1 \not\equiv a^2 \pmod p$

Table 12.2 A summary of the section


2. Let q be a prime of form $4k + 3$. Prove that x is not the sum of two nonzero
squares.

3. Find a representation of 5,525 as the sum of two squares.

4. Determine which of the following integers is a sum of two squares: 98, 343, 735,
1,428, and 4,680. Find such a representation if one exists.

5. Complete the proof of Lemma 12.18.

6. Prove that the prime $p$ is of the form $4k + 1$ if and only if $p$ divides $n^2 + (n + 1)^2$
for some integer $n$.


7. Let n=2-3*-59-13°. Is there a Pythagorean triangle with hypotenuse 7? 


12.3 Quadratic Reciprocity 


The theorem that goes by the name of the Law of Quadratic Reciprocity has the distinction
of having been reproved more times (over 200) than any other in mathematics, except
for the Theorem of Pythagoras. Its validity was conjectured both by Euler in the decade
1742-1751 and by Adrien-Marie Legendre in 1785. It was proved by Gauss in 1801.
Gauss considered it to be his theorema aureum (golden theorem) and over his lifetime
he produced at least six different proofs for it. It has also been called a crown jewel of
number theory. The proof presented here is based on one found on Wikipedia which, in
turn, is based on a proof of Eisenstein’s.


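Given an (odd) prime $p$ and an arbitrary integer $a$, their Legendre symbol is defined as

$$\left(\frac{a}{p}\right) = \begin{cases} 1 & \text{if } a \text{ is a quadratic residue of } p; \\ -1 & \text{if } a \text{ is not a quadratic residue of } p; \\ 0 & \text{if } p \text{ divides } a. \end{cases}$$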




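For example, by the calculations immediately following the proof of Euler’s criterion
(Proposition 12.14),

$$\left(\frac{1}{7}\right) = \left(\frac{2}{7}\right) = \left(\frac{4}{7}\right) = 1 \quad\text{and}\quad \left(\frac{3}{7}\right) = \left(\frac{5}{7}\right) = \left(\frac{6}{7}\right) = -1.$$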


We begin with a list of easily proved properties of the Legendre symbol. 


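Theorem 12.26 Let $p$ be an odd prime. Then
* $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod p$;
* $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$;
* $a \equiv b \pmod p$ implies that $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$;
* if $(a, p) = 1$, then $\left(\frac{a^2 b}{p}\right) = \left(\frac{b}{p}\right)$;
* $\left(\frac{1}{p}\right) = 1$;
* $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$.

Proof. The first is a restatement of Euler’s criterion. The proof of the second is relegated
to Exercise 12.3.12. The third statement is clear, and the fourth follows from the second. The
last two also follow from Euler’s criterion. ∎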


Theorem 12.27 (Fermat, Euler) If $p$ is an odd prime, then

$$\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}.$$


Proof. Let $p$ be an odd prime. If $p = 4k + 1$ for some integer $k$, then, modulo $p$,

$$2^{2k}(2k)! = 2\cdot 4\cdot 6\cdots 4k$$
$$= 2\cdot 4\cdot 6\cdots 2k\cdot(2k+2)\cdot(2k+4)\cdots(4k-4)\cdot(4k-2)\cdot 4k$$
$$\equiv 2\cdot 4\cdot 6\cdots 2k\cdot(-2k+1)\cdot(-2k+3)\cdots(-5)\cdot(-3)\cdot(-1)$$
$$= 2\cdot 4\cdot 6\cdots 2k\cdot(-(2k-1))\cdot(-(2k-3))\cdots(-5)\cdot(-3)\cdot(-1)$$
$$= (-1)^k(2k)! \pmod p.$$

Since $(2k)!$ is relatively prime to $p = 4k + 1$, it can be canceled out of the above
congruence and we obtain

$$2^{(p-1)/2} = 2^{2k} \equiv (-1)^k \pmod p.$$

By Euler’s criterion and the previous equation

$$\left(\frac{2}{p}\right) \equiv 2^{(p-1)/2} \equiv (-1)^{(p-1)/4} \pmod p.$$

On the other hand, if $p = 4k + 3$ then, modulo $p$,

$$2^{2k+1}(2k+1)! = 2\cdot 4\cdot 6\cdots(4k+2)$$
$$= 2\cdot 4\cdot 6\cdots 2k\cdot(2k+2)\cdot(2k+4)\cdots(4k-2)\cdot 4k\cdot(4k+2)$$
$$\equiv 2\cdot 4\cdot 6\cdots 2k\cdot(-(2k+1))\cdots(-5)(-3)(-1)$$
$$= (-1)^{k+1}(2k+1)! \pmod p.$$

Since $(2k + 1)!$ is relatively prime to $p = 4k + 3$, it can be canceled out of the above
congruence and we obtain

$$2^{(p-1)/2} = 2^{2k+1} \equiv (-1)^{k+1} \pmod p.$$

By Euler’s criterion,

$$\left(\frac{2}{p}\right) \equiv 2^{2k+1} \equiv (-1)^{(p+1)/4} \pmod p.$$

It remains to show that whenever $p = 4k + 1$,

$$\frac{p-1}{4} \equiv \frac{p^2-1}{8} \pmod 2,$$

and whenever $p = 4k + 3$,

$$\frac{p+1}{4} \equiv \frac{p^2-1}{8} \pmod 2.$$

These tasks are relegated to Exercise 12.3.13. ∎
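The formula of Theorem 12.27 can be compared with a direct evaluation of the Legendre symbol. In the Python sketch below, `legendre` is an assumed helper that evaluates the symbol through Euler's criterion (the first part of Theorem 12.26); it is not a library function.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed as a^((p-1)/2) mod p."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r   # r is 0, 1, or p - 1


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
for p in odd_primes:
    # Compare (2/p) with (-1)^((p^2 - 1)/8).
    assert legendre(2, p) == (-1) ** ((p * p - 1) // 8)

print([(p, legendre(2, p)) for p in (3, 5, 7, 17)])  # [(3, -1), (5, -1), (7, 1), (17, 1)]
```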


The floor function is very useful in describing long division: If the number $x$ is to be
(long) divided by $d$, then

$$x = d\lfloor x/d \rfloor + r \qquad (12.28)$$

where $0 \le r < d$ is the remainder.


Lemma 12.29 (Eisenstein) Let $p$ and $q$ be distinct odd primes and let the variable $u$ vary
over the even numbers $E = \{2, 4, \ldots, p - 1\}$. Then

$$\left(\frac{q}{p}\right) = (-1)^{\sum_{u \in E} \lfloor qu/p \rfloor}.$$


Proof. For each $u \in E$ let $r(u)$ denote the least positive residue of $qu$ modulo $p$. For
example, if $p = 17$ and $q = 13$, then $u$ has the values 2, 4, 6, 8, 10, 12, 14, and 16 and $r(u)$
assumes the values 9, 1, 10, 2, 11, 3, 12, and 4.

We show that the function $(-1)^{r(u)} r(u)$, reduced to its least positive residue modulo $p$,
is in fact a permutation of $E$. The least positive residues of the integers
$(-1)^{r(u)} r(u)$ are all even (in our running example, they are 8, 16, 10, 2, 6, 14, 12, and 4). This
is obvious when $r(u)$ is even. If $r(u)$ is odd, then $(-1)^{r(u)} r(u) = -r(u) < 0$. However,
by definition, $0 < r(u) < p$ and hence the least positive residue of $-r(u)$ modulo $p$ is
$p - r(u)$, which is even because both $p$ and $r(u)$ are odd.

These residues are all distinct. For, if

$$(-1)^{r(u)} r(u) \equiv (-1)^{r(v)} r(v) \pmod p,$$

then $qu \equiv r(u) \equiv \pm r(v) \equiv \pm qv \pmod p$, and hence $u \equiv \pm v \pmod p$. If $u \equiv -v$, then
$u$ and $v$ are both even, $p$ is odd, $4 \le u + v \le 2p - 2$ and $u + v \equiv 0 \pmod p$, which is only
possible if $u + v = p$. Since $u$ and $v$ are even and $p$ is an odd prime, this cannot happen and
we are forced to conclude that $u \equiv v \pmod p$. Since $u$ and $v$ are both in $E$, it follows
that $u = v$.


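The function $(-1)^{r(u)} r(u)$ is now known to have the set

$$\{2, 4, \ldots, p - 1\}$$

as its domain and its range, and is also known to be one-to-one. It therefore is a permutation
(or rearrangement) of $E$. Recall that, by definition, $r(u) \equiv qu \pmod p$ and hence
we have the following chain of equivalences modulo $p$:

$$2\cdot 4\cdots(p-1) \equiv (-1)^{r(2)} r(2)\cdot(-1)^{r(4)} r(4)\cdots(-1)^{r(p-1)} r(p-1)$$
$$\equiv (-1)^{r(2)}\,2q\cdot(-1)^{r(4)}\,4q\cdots(-1)^{r(p-1)}(p-1)q$$
$$= (-1)^{r(2)+r(4)+\cdots+r(p-1)}\; 2\cdot 4\cdots(p-1)\; q^{(p-1)/2}$$

and hence, upon division by $2\cdot 4\cdots(p-1)$, we have

$$\left(\frac{q}{p}\right) \equiv q^{(p-1)/2} \equiv (-1)^{r(2)+r(4)+\cdots+r(p-1)} = (-1)^{\sum r(u)} \pmod p \qquad (12.30)$$

where $u$ varies over $E$.

However, (long) division of $qu$ by $p$ yields

$$qu = p\left\lfloor \frac{qu}{p} \right\rfloor + r(u).$$

Sum this equation over all $u \in E$ to obtain

$$q\sum u = p\sum \left\lfloor \frac{qu}{p} \right\rfloor + \sum r(u).$$

Since all the $u$'s are even, it follows that

$$p\sum \left\lfloor \frac{qu}{p} \right\rfloor \equiv \sum r(u) \pmod 2.$$

Since $p$ is odd,

$$\sum \left\lfloor \frac{qu}{p} \right\rfloor \equiv \sum r(u) \pmod 2.$$

In view of Equation 12.30, we are done. ∎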
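Eisenstein's lemma lends itself to a direct numerical check: sum the floors $\lfloor qu/p \rfloor$ over the even $u$ and compare the resulting sign with the Legendre symbol. The sketch below does this in Python; `legendre` (via Euler's criterion) and `eisenstein_exponent` are helper names assumed for the illustration.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def eisenstein_exponent(q, p):
    """Sum of floor(q*u/p) over the even numbers u = 2, 4, ..., p - 1."""
    return sum((q * u) // p for u in range(2, p, 2))


# Running example of the proof: p = 17, q = 13.
print(eisenstein_exponent(13, 17), legendre(13, 17))  # 52 1

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print(all(legendre(q, p) == (-1) ** eisenstein_exponent(q, p)
          for p in odd_primes for q in odd_primes if p != q))  # True
```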


Figure 12.1 Lattice point diagram

Figure 12.2 Example showing lattice points inside ABC with even x-coordinates for p = 17
and q = 13

Figure 12.3 The number of points with even x-coordinate inside BCYX is equal modulo 2 to
the number of such points in CZY

Figure 12.4 The number of points with even x-coordinate inside CZY is equal to the number
of points with odd x-coordinate inside AXY


Yet another definition is called for here. A point $(x, y)$ with integer components is
said to be a lattice point. If $S$ is any subset of the plane, then $E(S)$ and $O(S)$ denote
the number of lattice points in $S$ whose x-coordinate, respectively, is even or odd. Also,
$L(S) = E(S) + O(S)$ is the total number of lattice points in $S$.


Proposition 12.31 (Eisenstein) If $p$ and $q$ are distinct odd primes then

$$\left(\frac{q}{p}\right) = (-1)^{\sum_{j=1}^{(p-1)/2} \lfloor jq/p \rfloor}.$$
P 


Proof. Let $p$ and $q$ be two distinct odd primes and let

$$A = (0, 0), \quad B = (p, 0), \quad C = (p, q), \quad D = (0, q)$$

be the vertices of the rectangle of Figure 12.1 (where $p = 17$ and $q = 13$). It follows from
Eisenstein’s Lemma 12.29 that

$$\left(\frac{q}{p}\right) = (-1)^{E(ABC)} = (-1)^{E(AXY)}\,(-1)^{E(BCYX)} = (-1)^{E(AXY)+E(CZY)}$$

because

$$E(BCYX) + E(CZY) \equiv 0 \pmod 2$$

and so, applying Eisenstein’s Lemma 12.29 again, we obtain, with exponents taken modulo 2,

$$\left(\frac{q}{p}\right) = (-1)^{E(AXY)+O(AXY)} = (-1)^{L(AXY)} = (-1)^{\sum_{j=1}^{(p-1)/2} \lfloor jq/p \rfloor}. \qquad ∎$$


Theorem 12.32 (Law of Quadratic Reciprocity) If $p$ and $q$ are distinct odd primes,
then

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{[(p-1)/2][(q-1)/2]}.$$


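Proof. Referring back to Figures 12.1 to 12.4,

$$\left(\frac{q}{p}\right) = (-1)^{L(AXY)}.$$

By symmetry,

$$\left(\frac{p}{q}\right) = (-1)^{L(AWY)}.$$

Since the diagonal $AC$ contains no other lattice points than $A$ and $C$, it follows that

$$\frac{p-1}{2}\cdot\frac{q-1}{2} = L(AWYX) = L(AXY) + L(AWY)$$

and hence, by Proposition 12.31,

$$(-1)^{[(p-1)/2][(q-1)/2]} = (-1)^{L(AXY)}\,(-1)^{L(AWY)} = \left(\frac{q}{p}\right)\left(\frac{p}{q}\right). \qquad ∎$$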
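The Law of Quadratic Reciprocity is simple to test on small primes. The following Python sketch is a numerical check only; it again assumes a helper `legendre` based on Euler's criterion, and the name `reciprocity_holds` is invented for the illustration.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def reciprocity_holds(p, q):
    """Check (p/q)(q/p) == (-1)^(((p-1)/2)((q-1)/2)) for distinct odd primes p, q."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) * legendre(q, p) == sign


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print(all(reciprocity_holds(p, q) for p in odd_primes for q in odd_primes if p != q))  # True
```

In practice the law is used in the other direction: it lets one replace $(q/p)$ by $(p/q)$ and reduce the numerator, which is how large symbols are evaluated by hand.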
p/\q 


Exercises 12.3 


1. Consider $ax^2 + bx + c \equiv 0 \pmod p$ where $p$ is an odd prime and $p \nmid a$. Let
$D = b^2 - 4ac$. Prove that the congruence has
(a) no solutions if $D$ is a quadratic nonresidue of $p$,
(b) a unique solution if $p \mid D$,
(c) exactly two solutions if $D$ is a quadratic residue of $p$.

2. Suppose a is a quadratic residue of the odd prime p, and prove that —a is also a 
quadratic residue of p if and only if —1 is a quadratic residue of p. 

3. What are the least positive residues of the quadratic residues of 31?

4. Is $x^2 \equiv -2 \pmod{263}$ solvable?

5. Let $p$ be an odd prime. Prove that the product of the quadratic residues of $p$ is
congruent to $-1$ or 1 modulo $p$ according as $p \equiv 1 \pmod 4$ or $p \equiv 3 \pmod 4$.

6. Suppose $p$ is a prime of the form $8k + 3$. Does $p$ divide $2^{(p-1)/2} - 1$?

7. Characterize the odd primes $p \ne 7$ such that $x^2 \equiv 7 \pmod p$ has a solution in
$\mathbb{Z}_p$.

8. Characterize the odd primes $p \ne 11$ such that $x^2 \equiv 11 \pmod p$ has a solution in
$\mathbb{Z}_p$.

9. Calculate the following by hand: 


@ (5) ©) (3) © (3) @ (¥) 




10. Calculate the following by hand: 
02) oo) e8)  o() 


1109 1933 3313 4783 


11. Let $p > 3$ be an odd prime. Prove that the sum of the quadratic residues of $p$ is
divisible by $p$. (Hint: Recall that $\sum_{i=1}^{n} i^2 = n(n + 1)(2n + 1)/6$.)
12. Prove the second part of Theorem 12.26. 


13. Complete the proof of Lemma 12.29. 


12.4 The Gaussian Integers 


Having proved the Law of Quadratic Reciprocity, Gauss went on to look for higher-order 
analogs. To be specific, he worked on cubic and biquadratic (alias quartic) versions. 
Amongst other things he discovered that the results and their proofs were considerably 
simplified by widening the scope to complex “integers”, rather than restricting them to 
the standard integers of Z. We now turn to the study of this extension. 

The Gaussian integers, so named for obvious reasons, consist of all the complex numbers
of the form $a + bi$ where $a, b \in \mathbb{Z}$ (see Figure 12.5). This new number system is denoted by
$\mathbb{Z}[\sqrt{-1}]$, consistently with the definition of $F[x]$ in Section 6.1. For example, $17 + 9i$,
$12 - 14i$, $-14i$, and 5 are all Gaussian integers. It was only natural for Gauss to seek out the primes
of this new number system. Even the elementary fact $5 = (2 + i)(2 - i)$ is sufficient to
indicate that, as mathematicians like to say, something is going on here.

However, some preliminary remarks are in order before we delve into the issue of the
factorization of the Gaussian integers. Note that we stand in some danger of confusion
as 5 seems to be a prime in the context of $\mathbb{Z}$ but not so in the wider context of $\mathbb{Z}[\sqrt{-1}]$.
To clear this up we henceforth designate the classical integers, $\mathbb{Z}$, as the rational integers
and their primes as rational primes. In contrast the Gaussian primes are those of $\mathbb{Z}[\sqrt{-1}]$.
The reason for the first appellation is that the primes of $\mathbb{Z}$ do not involve $i = \sqrt{-1}$, which,
by analogy with $\sqrt{2}$ and $\sqrt{3}$, is an irrational number.

In this textbook we have so far discussed the issue of factorization in two contexts: the 
integers Z and the polynomials Q[x]. For the purpose of uniqueness of factorization 
of the integers it was necessary to regard —p and p as redundant. Similarly, when 
factoring polynomials we consider cP(x) and P(x) to be essentially the same factors. 
These observations can be unified by defining the units of $\mathbb{Z}$ to be 1 and $-1$ and the
units of $\mathbb{Q}[x]$ to be all the nonzero rational numbers, $\mathbb{Q}^*$.




Figure 12.5 The Gaussian integers $\mathbb{Z}[\sqrt{-1}]$


Notice that the units of these two systems have the property that the inverse of a unit
is again a unit, and we take this to be the defining characteristic of the units of $\mathbb{Z}[\sqrt{-1}]$.
Specifically, the units of the Gaussian integers are the numbers 1, $-1$, $i$, and $-i$. Two
Gaussian integers $z$ and $w$ are said to be associates provided there is a unit $u$ such that
$uz = w$. For example, $2 - 3i$, $3 + 2i$, $-2 + 3i$, and $-3 - 2i$ are all associates of each other.
In general associates come in groups of four, 0 being the only exception.

The Gaussian integer $\pi$ is a Gaussian prime provided that whenever $\pi = zw$ is a
factorization of $\pi$, then either $z$ or $w$ is a unit. We now define the ultra useful norm
function $N : \mathbb{Z}[i] \to \mathbb{Z}$ via

$$N(a + bi) = a^2 + b^2 = |a + bi|^2.$$

The following lemma provides a very strong, though still incomplete, tool for identifying
the Gaussian primes:


Lemma 12.33 The norm function is multiplicative in the sense that $N(zw) =
N(z)N(w)$ for all $z, w \in \mathbb{C}$.

Proof. This follows from Proposition 12.9 and the fact that for any complex number $z$,
$N(z) = |z|^2$. ∎




Lemma 12.34 The units of $\mathbb{Z}[\sqrt{-1}]$ are characterized by the property that their norm
is 1.

Proof. Suppose $u$ is a unit. Then there exists another Gaussian integer, say $w$, such that
$uw = 1$ and hence $N(uw) = N(u)N(w) = 1$ where $N(u)$ and $N(w)$ are positive rational
integers. Consequently, $N(u) = 1$.

Conversely, let $N(u) = 1$. Hence, if $u = a + bi$, then $a^2 + b^2 = 1$ where $a$ and $b$ are
rational integers. The solutions of this equation are $u = 1$, $u = i$, $u = -1$, and $u = -i$.
Since each of the last four numbers has an inverse, it follows that $u$ is indeed a unit. ∎
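The norm, the units, and the notion of associates can be explored with a few lines of Python. In this sketch a Gaussian integer $a + bi$ is represented by the pair `(a, b)` of ordinary integers; that representation and the function names are assumptions made for the illustration.

```python
def norm(z):
    """N(a + bi) = a^2 + b^2 for the Gaussian integer z = (a, b)."""
    a, b = z
    return a * a + b * b


def mul(z, w):
    """Product of Gaussian integers: (a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


UNITS = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # 1, -1, i, -i


def associates(z):
    """The set of all unit multiples of z."""
    return {mul(u, z) for u in UNITS}


print(sorted(associates((2, -3))))            # [(-3, -2), (-2, 3), (2, -3), (3, 2)]
z, w = (17, 9), (12, -14)
print(norm(mul(z, w)) == norm(z) * norm(w))   # True
```

The first line of output reproduces the four associates $2 - 3i$, $3 + 2i$, $-2 + 3i$, and $-3 - 2i$ mentioned in the text.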


Table 12.3 lists factorizations for all the Gaussian integers with norm at most 50 and
which lie in the first quadrant. Prime Gaussian integers are most easily recognized by
their norms:


Corollary 12.35 If $\pi$ is a Gaussian integer whose norm is a rational prime, then $\pi$ is a
Gaussian prime.

Proof. Suppose $N(\pi)$ is a rational prime $p$ and $\pi = zw$. Then $p = N(\pi) = N(zw) = N(z)N(w)$. It
follows that either $N(z) = 1$ or $N(w) = 1$ which implies that either $z$ or $w$ is a unit.
Hence $\pi$ is a Gaussian prime. ∎


For example, $N(1 - i) = 2$, $N(1 + 2i) = 5$, $N(-3 + 2i) = 13$, $N(4 + i) = 17$, and
$N(5 - 2i) = 29$. Hence $1 - i$, $1 + 2i$, $-3 + 2i$, $4 + i$, and $5 - 2i$ are all Gaussian primes.


Proposition 12.36 Let $p$ be an odd rational prime. Then $p$ is also a Gaussian prime if
and only if $p$ has the form $4k + 3$, if and only if $p$ does not have an expression as the
sum of two squares.

Proof. Let $p$ be a rational prime of form $4k + 3$. If $p$ is not a Gaussian prime, then
there exist nonunit integers $\alpha$ and $\beta$ such that $p = \alpha\beta$. Taking norms, $p^2 = N(\alpha)N(\beta)$.
Since $\alpha$ and $\beta$ are nonunits, both must have norm $p$. This means that if $\alpha = a + bi$, then
$p = a^2 + b^2$, contradicting Theorem 12.10.

Alternatively, if $p = 4k + 1$, then there exist rational integers $a$ and $b$ such that
$p = a^2 + b^2$. So if we set $\pi = a + bi$, then $\pi\bar{\pi} = p$ and so we have a factorization of $p$
into nonunits. In other words, $p$ is not a Gaussian prime. ∎


norm  factors                    norm  factors                      norm  factors

  2   1+i                         18   3+3i = 3(1+i)                 36   6 = -3i(1+i)^2
  4   2 = -i(1+i)^2               20   2+4i = (1+i)^2(2-i)           37   1+6i
  5   1+2i                             4+2i = -i(1+i)^2(2+i)              6+i
      2+i                         25   3+4i = (2+i)^2                40   2+6i = -i(1+i)^3(2+i)
  8   2+2i = -i(1+i)^3                 4+3i = i(2-i)^2                    6+2i = -i(1+i)^3(2-i)
  9   3                                5 = (2+i)(2-i)                41   4+5i
 10   1+3i = (1+i)(2+i)           26   1+5i = (1+i)(3+2i)                 5+4i
      3+i = (1+i)(2-i)                 5+i = (1+i)(3-2i)             45   3+6i = 3i(2-i)
 13   2+3i                        29   2+5i                               6+3i = 3(2+i)
      3+2i                             5+2i                          49   7
 16   4 = -(1+i)^4                32   4+4i = -(1+i)^5               50   1+7i = i(1+i)(2-i)^2
 17   1+4i                        34   3+5i = (1+i)(4+i)                  5+5i = (1+i)(2+i)(2-i)
      4+i                              5+3i = (1+i)(4-i)                  7+i = -i(1+i)(2+i)^2

Table 12.3 Prime factorizations in Z[i]

The readers are reminded that the Euclidean algorithm plays a central role in the
factorization of the rational integers and we now set out to show that an analogous
procedure exists for the Gaussian integers. This is far from obvious as it will soon be
demonstrated that there are many number systems very similar to the Gaussian integers
which do not possess such an analog.

Let $\alpha$, $\beta$, and $\gamma$ be Gaussian integers such that $\alpha\beta = \gamma$. Then we say that $\alpha$ is a divisor
of $\gamma$ and $\gamma$ is a multiple of $\alpha$. The integer $\gamma$ is a greatest common divisor (GCD) or highest
common factor (HCF) of $\alpha$ and $\beta$ provided that

(a) $\gamma$ divides both $\alpha$ and $\beta$, and

(b) $\gamma$ is a multiple of every common divisor of $\alpha$ and $\beta$.

If such an HCF indeed exists, then it is denoted by $(\alpha, \beta)$.


Lemma 12.37 (Division Algorithm) Let $z$ and $w$ be two nonzero Gaussian integers
such that $N(z) > N(w)$. Then there exist two Gaussian integers $q$ and $r$ such that

$$z = qw + r \quad\text{and}\quad N(r) < N(w). \qquad (12.38)$$

Moreover,

$$(z, w) = (w, r) \qquad (12.39)$$

and there exist Gaussian integers $\lambda$ and $\mu$ such that

$$(z, w) = \lambda z + \mu w. \qquad (12.40)$$


Proof. Let $z = a + bi$ and $w = c + di$. Then

$$\frac{z}{w} = \frac{a + bi}{c + di} = s + ti$$

where

$$s = \frac{ac + bd}{c^2 + d^2} \quad\text{and}\quad t = \frac{bc - ad}{c^2 + d^2}.$$

Let $s'$ and $t'$ be the rational integers closest to $s$ and $t$, respectively. Then clearly

$$|s - s'| \le \frac{1}{2} \quad\text{and}\quad |t - t'| \le \frac{1}{2}. \qquad (12.41)$$

Thus, we could think of $q = s' + t'i$ as the Gaussian integer that is closest to $z/w$. Set

$$\rho = \frac{z}{w} - q \qquad (12.42)$$

and note that

$$N(\rho) = (s - s')^2 + (t - t')^2 \le \frac{1}{4} + \frac{1}{4} < 1.$$

By Equation 12.42, $z = qw + \rho w$ where $z$, $w$, and $q$ are all Gaussian integers, so that
$\rho w$ is also a Gaussian integer for which

$$N(\rho w) = N(\rho)N(w) < 1\cdot N(w) = N(w).$$

So if we set $r = \rho w$ we are done.

As for Equation 12.39, it follows from $z = qw + r$ that every common factor of $z$ and
$w$ is also a common factor of $w$ and $r$, and, vice versa, every common factor of $w$ and
$r$ is also a common factor of $z$ and $w$. The desired equality follows immediately. Finally,
the existence of $\lambda$ and $\mu$ follows from the same argument that was used in Proposition 4.1
to prove the existence of $A$ and $B$. ∎


Proposition 12.43 For any two Gaussian integers $\alpha$ and $\beta$, not both 0, the HCF $(\alpha, \beta)$
exists and every two HCFs of $\alpha$ and $\beta$ are associates.

Proof. Let $I$ be the set of all the Gaussian integers of the form $\sigma\alpha + \tau\beta$ where
$\sigma, \tau \in \mathbb{Z}[\sqrt{-1}]$. Let $\delta$ be an element of $I$ of minimum positive norm, and suppose
$\delta = \lambda\alpha + \mu\beta$. We will show that $\delta$ is an HCF of $\alpha$ and $\beta$.
First, note that by the Division Algorithm there exist $\kappa$ and $\rho$ such that

$$\rho = \alpha - \kappa\delta = \alpha - \kappa(\lambda\alpha + \mu\beta) = (1 - \kappa\lambda)\alpha + (-\kappa\mu)\beta$$

with, moreover, $N(\rho) < N(\delta)$. Since $\rho$ belongs to $I$, this is impossible unless $\rho = 0$. This,
in turn, implies that $\delta \mid \alpha$ and similarly $\delta \mid \beta$, and so $\delta$ is a common divisor of $\alpha$ and $\beta$
(first property). Finally, if $\gamma$ divides both $\alpha$ and $\beta$, it must divide every integer combination
of $\alpha$ and $\beta$, including $\delta$ (second property). ∎


The Division Algorithm (Lemma 12.37) is the Gaussian analog of the process of division in
the context of the rational integers and it is customary to refer to $q$ and $r$ as the quotient
and remainder, respectively, resulting from the division of $z$ by $w$. Note that in the Gaussian
case there are occasionally some arbitrary choices called for. For example, if $(s, t)$ is the center of
the unit square that contains it, then any of the four surrounding points will serve as
$(s', t')$. The outcome is the same regardless of the choice (why?).

Let $m$ and $n$ be two Gaussian integers such that $N(m) > N(n)$. Set $m_1 = m$ and $n_1 = n$,
and let $q_1$ and $r_1$ be the appropriate quotient and remainder so that $m_1 = q_1 n_1 + r_1$. For
$i = 1, 2, 3, \ldots$ we set $m_{i+1} = n_i$ and $n_{i+1} = r_i$ with $q_{i+1}$ and $r_{i+1}$ being the respective
quotient and remainder when $m_{i+1}$ is divided by $n_{i+1}$. Note that if $r_i \ne 0$, then either
$r_{i+1} = 0$ or else $N(r_{i+1}) < N(n_{i+1}) = N(r_i)$. Because all the $N(r_i)$'s are nonnegative rational
integers this process must eventually produce the first $k$ for which $r_k = 0$. Thus $(m, n) = r_{k-1}$.
For example, let $z = 5 - i$ and $w = 4 + 2i$. Then

$$\frac{z}{w} = \frac{5 - i}{4 + 2i} = \frac{9}{10} - \frac{7i}{10}.$$

Consequently $s' = 1$, $t' = -1$, and $q = 1 - i$, from which we obtain $\rho w = z - qw =
5 - i - (1 - i)(4 + 2i) = -1 + i$. Since $4 + 2i = (-1 - 3i)(-1 + i)$, the next remainder is 0.
In other words, $(5 - i, 4 + 2i) = -1 + i$.
For another example, we will find a GCD of $-15 - i$ and $-9 + 5i$. Successive divisions,
each of which preserves the common divisors by Equation 12.39, yield the following equations:

$$-15 - i = (1 + 2i)(-9 + 5i) + (4 + 12i),$$
$$-9 + 5i = i(4 + 12i) + (3 + i),$$
$$4 + 12i = (2 + 3i)(3 + i) + (1 + i),$$
$$3 + i = (2 - i)(1 + i).$$

Hence the required GCD is $1 + i$.
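The division step and the resulting Euclidean algorithm can be sketched in Python. The code below is an illustration under stated assumptions: Gaussian integers are represented as pairs `(a, b)` of ordinary integers, the quotient is always taken to be the nearest Gaussian integer (ties rounded upward), and the names `gauss_divmod` and `gauss_gcd` are invented for this sketch. Because of the rounding convention the intermediate quotients may differ from those of a hand computation, and the result is determined only up to a unit factor.

```python
def gauss_divmod(z, w):
    """Quotient q and remainder r with z = q*w + r and N(r) < N(w), for w != 0."""
    (a, b), (c, d) = z, w
    n = c * c + d * d                       # N(w)
    s_num, t_num = a * c + b * d, b * c - a * d   # z/w = (s_num + t_num*i) / n
    # Nearest integers to s_num/n and t_num/n, computed exactly as floor(x + 1/2).
    s1 = (2 * s_num + n) // (2 * n)
    t1 = (2 * t_num + n) // (2 * n)
    r = (a - (s1 * c - t1 * d), b - (s1 * d + t1 * c))
    return (s1, t1), r


def gauss_gcd(z, w):
    """A greatest common divisor of z and w by repeated division."""
    while w != (0, 0):
        _, r = gauss_divmod(z, w)
        z, w = w, r
    return z


print(gauss_divmod((5, -1), (4, 2)))    # ((1, -1), (-1, 1)): q = 1 - i, r = -1 + i
print(gauss_gcd((5, -1), (4, 2)))       # (-1, 1), that is, -1 + i
print(gauss_gcd((-15, -1), (-9, 5)))    # (-1, 1), an associate of 1 + i
```

The last result differs from $1 + i$ only by the unit $i$, since $i(1 + i) = -1 + i$.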


Proposition 12.44 If $g$ is a greatest common divisor of the two Gaussian integers $z$
and $w$, then there exist Gaussian integers $A$ and $B$ such that $Az + Bw = g$.

Proof. See proofs of Propositions 4.1 and 6.14. ∎


Proposition 12.45 Let $\pi$ be a Gaussian prime and $z$ and $w$ Gaussian integers such
that $\pi \mid zw$. Then either $\pi \mid z$ or $\pi \mid w$.

Proof. Suppose $\pi$ does not divide $z$. Since $\pi$ is a Gaussian prime that does not divide
$z$, it follows that $(\pi, z) = 1$. Let $A$ and $B$ be Gaussian integers such that $A\pi + Bz = 1$.
Multiply by $w$ to get $A\pi w + Bzw = w$. Since $\pi$ divides each of the summands on the
left it follows that $\pi$ divides their sum $w$. ∎


Theorem 12.46 Every nonzero and nonunit Gaussian integer can be factored into
Gaussian primes in an essentially unique way.


Proof. The proof of the existence of a factorization into primes proceeds by mathematical
induction on the norm. The integers $\pm 1 \pm i$ are all the Gaussian integers of norm 2.
Since they are also primes, the induction has been anchored. Assume the existence of a
prime factorization has been demonstrated for all numbers of norm less than $n$, and let
$z$ be a number with $N(z) = n$. If $z$ is a prime, there is nothing to prove. If $z$ is composite,
then there exist Gaussian integers $z_1$ and $z_2$, both nonzero and nonunits, such that $z = z_1z_2$,
$N(z) = N(z_1)N(z_2)$, and, hence, $N(z_1) < N(z) = n$ and $N(z_2) < N(z) = n$. By the induction
hypothesis, both $z_1$ and $z_2$ have prime factorizations which, together, yield a prime
factorization of $z$. By induction, the existence of a prime factorization has been demonstrated.

We next turn to the issue of uniqueness. The proof proceeds again by induction on
the norm. If $z$ has norm 2, then, by Corollary 12.35, $z$ is a prime and so there cannot be
another factorization into primes. This anchors the induction process. Assume that the
uniqueness has been established for all integers of norm less than $n$ and that $N(z) = n$.
Let

$$p_1^{h_1} p_2^{h_2} \cdots p_r^{h_r} = z = q_1^{k_1} q_2^{k_2} \cdots q_s^{k_s}$$

be two prime factorizations of $z$. Since $p_1$ divides $z$, it follows from
a repeated application of Proposition 12.45 that $p_1 = \eta q_j$ for some subscript $j$ and some
unit $\eta$. It follows that $z/p_1 = z/(\eta q_j)$. Relabeling, if necessary, rewrite $z/p_1 = z/(\eta q_1)$.
Since the common norm of both sides of this equation is $N(z)/N(p_1) < N(z)$, it follows
from the induction hypothesis that $r = s$ and for $i = 1, 2, \ldots, r$, $h_i = k_i$, and $p_i$ and $q_i$
are associates. Hence, by induction, we are done. ∎


Lemma 12.47 For each Gaussian prime $\pi$ there exists a rational prime $p$ such that $\pi \mid p$.

Proof. Let $\pi$ be a Gaussian prime and set $n = N(\pi) = \pi\bar{\pi}$. Let $n = p_1 p_2 \cdots p_k$ be a factorization of $n$ into rational primes. Clearly $\pi \mid n$. Hence, by an easy extension of Proposition 12.45, there exists a rational prime $p_i$ such that $\pi \mid p_i$. ∎


Theorem 12.48 (Gauss, 1801) The Gaussian integer $\pi$ is a Gaussian prime if and only if one of the following holds:

(a) $\pi$ is $1 - i$ or an associate;

(b) $\pi$ is a rational prime of the form $4k + 3$ or an associate;

(c) $N(\pi) = p$, where $p$ is a rational prime of the form $4k + 1$.


Proof. Let $\pi$ have one of the formats listed above. We show that in all three cases $\pi$ is a Gaussian prime.

If $\pi$ is $1 - i$ or an associate, then $\pi$ is a Gaussian prime because its norm is 2 (see Corollary 12.35).

Next, let $\pi$ be a rational prime of form $4k + 3$ and suppose $\pi = \alpha\beta$ is a factorization of $\pi$. Then $N(\pi) = \pi^2 = N(\alpha)N(\beta)$. Since $\pi$ has the form $4k + 3$, it is not the sum of two squares and hence $N(\alpha) \neq \pi \neq N(\beta)$, and it follows that either $N(\alpha) = 1$ or $N(\beta) = 1$. Hence one of $\alpha$ or $\beta$ is a unit and so $\pi$ is a Gaussian prime.

For the third case, let $p$ be the sum of two nonzero squares, say $a^2$ and $b^2$, so that $p = (a + bi)(a - bi) = N(a + bi) = N(a - bi)$, and set $\pi = a + bi$. Then $\pi$ and its associates have prime norm and hence, by Corollary 12.35, they are Gaussian primes.

Conversely, we now argue that if $\pi = a + bi$ is any Gaussian prime, then it must fall into one of these three categories. If $b = 0$, then $\pi = a$ is a rational prime as well as a Gaussian prime and by Theorem 12.10 it has the form $4k + 3$. If $a = 0$, then $\pi = bi$ and so $\pi$ is the associate of a rational prime of the form $4k + 3$. In both of these cases $\pi$ falls into the second category.

Hence it may be assumed that neither $a$ nor $b$ vanishes. Set $p = a^2 + b^2$. We now show that $p$ is necessarily a rational prime. For otherwise there exist nonunit rational integers $c$ and $d$ such that

$$(a + bi)(a - bi) = p = cd$$

which leads to two distinct (Gaussian) prime factorizations of $p$. Hence, if $N(\pi) = 2$, then $\pi$ falls in the first category. Otherwise $p$ has form $4k + 1$ and $\pi$ falls in the third category. ∎
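The classification in Theorem 12.48 translates directly into a primality test. The following is a minimal sketch, not part of the original text; the function names `is_rational_prime` and `is_gaussian_prime` are illustrative assumptions, and trial division is used only because it is the simplest choice for small inputs.

```python
def is_rational_prime(n: int) -> bool:
    """Trial-division primality test for ordinary (rational) integers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def is_gaussian_prime(a: int, b: int) -> bool:
    """Decide whether a + bi is a Gaussian prime via Theorem 12.48."""
    if a == 0 or b == 0:
        # Case (b): an associate of a rational prime of the form 4k + 3.
        m = abs(a + b)  # absolute value of the nonzero coordinate (0 if both vanish)
        return is_rational_prime(m) and m % 4 == 3
    # Cases (a) and (c): both coordinates nonzero, so the norm must be prime
    # (it is then automatically 2 or of the form 4k + 1).
    return is_rational_prime(a * a + b * b)
```

For instance, under these assumptions `is_gaussian_prime(2, 1)` holds because $N(2 + i) = 5$, while `is_gaussian_prime(5, 0)` fails because $5 = (2 + i)(2 - i)$.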
Proposition 12.49 If $p$ is a rational prime of the form $4k + 1$, then $p$ is expressible as the sum of two squares in a unique way.

Proof. The existence of such an expression was established in Theorem 12.10. Suppose, by way of contradiction, that there exist a rational prime $p$ and positive rational integers $s$, $t$, $x$, and $y$ such that $\{s, t\} \neq \{x, y\}$ and $s^2 + t^2 = p = x^2 + y^2$. Then, the classification (Theorem 12.48) implies that

$$(s + it)(s - it) = p = (x + iy)(x - iy)$$

are two distinct Gaussian prime factorizations of $p$, contradicting Lemma 12.11. Hence, each rational prime of form $4k + 1$ is expressible as the sum of two squares in only one way. ∎


We now show how the preceding material can be used to solve some Diophantine 


equations. 


Corollary 12.50 The equation $x^2 + 1 = y^3$ has the unique integer solution $x = 0$, $y = 1$.

Proof. Let $x, y$ be an integer solution of the proposed equation. Let $\delta$ be any common, nonunit, Gaussian divisor of $x + i$ and $x - i$. Then $\delta$ must divide $(x + i) - (x - i) = 2i$. A glance at the factorizations of Table 12.3 tells us that $\delta$ is divisible by $1 + i$. Hence

$$(1 + i) \mid (x + i). \qquad (12.51)$$

By Lemma 12.33, $N(1 + i) \mid N(x + i)$, or $2 \mid (x^2 + 1)$. Hence $x^2 + 1$ is even so that $x$ must be odd.

However, if $x$ is odd, we have $x^2 + 1 \equiv 2 \pmod 4$ and so $y^3 \equiv 2 \pmod 4$. This is impossible because the odd parity of $x$ also implies that $y$ would be even, in which case

$$y^3 = 8\left(\frac{y}{2}\right)^3 \equiv 0 \pmod 4,$$

thus contradicting the previous equation.

Consequently, every common divisor of $x + i$ and $x - i$ is a unit and hence $x + i$ and $x - i$ are relatively prime Gaussian integers or, in other words, they share no prime factors. Since their product is the cube of a Gaussian integer, and every unit of $\mathbb{Z}[i]$ is itself a cube, it follows that each of the two is a cube so that

$$x + i = (u + vi)^3 = u^3 - 3uv^2 + (3u^2 v - v^3)i$$

where $u$ and $v$ are real integers. Separating real and imaginary parts yields the simultaneous equations $x = u^3 - 3uv^2$ and $1 = v(3u^2 - v^2)$. The second equation implies that $v = -1$, from which it follows that $u = 0$. Substituting into the first equation yields $x = 0$ and hence $y = 1$. ∎


We note that the unique factorization property of $\mathbb{Z}[\sqrt{-1}]$ was used twice in the above proof, first to conclude that $x + i$ and $x - i$ are relatively prime and second to conclude that they are cubes of Gaussian integers.
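A brute-force search cannot replace the proof, but it is a quick sanity check of Corollary 12.50 on a finite range. The sketch below is an illustration only; the helper names `integer_cube_root` and `solutions` and the search bound are assumptions made here.

```python
def integer_cube_root(n: int):
    """Return y with y**3 == n for a positive integer n, or None if n is not a cube."""
    y = round(n ** (1.0 / 3.0))
    # Guard against floating-point error by testing the neighbours as well.
    for candidate in (y - 1, y, y + 1):
        if candidate ** 3 == n:
            return candidate
    return None


def solutions(bound: int):
    """All (x, y) with |x| <= bound and x^2 + 1 = y^3."""
    found = []
    for x in range(-bound, bound + 1):
        y = integer_cube_root(x * x + 1)
        if y is not None:
            found.append((x, y))
    return found


print(solutions(2000))
```

Within this range the only pair reported is $(0, 1)$, in agreement with the corollary.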


Exercises 12.4 


1. Find the number of all the Gaussian integers with norm less than $n$ for $n = 5, 6, 7, 8, 9$.

2. Let $g(n)$ denote the number of Gaussian integers of norm less than $n$. Is the number of integers $n$ for which $g(n) = g(n + 1)$ finite or infinite?

3. Report on Gauss's circle problem.


4. Solve the Diophantine equation x? + 1= y’. 

5- Solve the Diophantine equation x7 +1 = y?. 

6. Factor 100 into Gaussian primes.

7. Factor $37 - 9i$ into Gaussian primes.

8. Factor $62 + 41i$ into Gaussian primes.

9. Factor $1010 + 620i$ into Gaussian primes.

10. Factor $537 + 266i$ into Gaussian primes.

11. Apply the division algorithm of Lemma 12.37 to $z = 8 - 2i$ and $w = 3 + i$.

12. Apply the division algorithm of Lemma 12.37 to $z = 25 - 5i$ and $w = 2 + i$.

13. Use the Euclidean algorithm to find the gcd of the following pairs of Gaussian integers:
    (a) $(3 + 11i,\ 4 - 2i)$
    (b) $(144 + 29i,\ 19 + 26i)$
    (c) $(33 + 19i,\ -16 + 37i)$
    (d) $(26 + 19i,\ 29 + 14i)$


12.5 Eulerian Integers and Others 


It stands to reason that the same technique that was used in the resolution of the preceding proposition could also be brought to bear on such a Diophantine equation as

$$y^3 = x^2 + 2. \qquad (12.52)$$

In such a solution the right-hand side could be factored as

$$x^2 + 2 = (x + \sqrt{-2})(x - \sqrt{-2}).$$

This requires a context for such expressions as $x + \sqrt{-2}$, one which is easily provided by defining the Eulerian integers as

$$\mathbb{Z}[\sqrt{-2}] = \{\, u + v\sqrt{-2} \mid u, v \in \mathbb{Z} \,\}.$$

It is easily verified (Exercise 12.5.1) that this set is closed with respect to addition, subtraction, and multiplication. See Table 12.4 for a list of small primes of $\mathbb{Z}[\sqrt{-2}]$.




Let us assume for the moment that $\mathbb{Z}[\sqrt{-2}]$ has unique prime factorization and see whether or not the uniqueness of such a factorization would lead to the solution. Here

$$x^2 + 2 = (x + \sqrt{-2})(x - \sqrt{-2})$$

and we must show that the two factors on the right are relatively prime in $\mathbb{Z}[\sqrt{-2}]$.

Suppose, by way of contradiction, that $\delta$ is a prime in $\mathbb{Z}[\sqrt{-2}]$ that divides both factors. Then $\delta \mid 2\sqrt{-2}$. Since $2\sqrt{-2} = -(\sqrt{-2})^3$ is a prime power, $\delta$ must be $\pm\sqrt{-2}$. Consequently $\sqrt{-2} \mid y^3$ and hence $\sqrt{-2} \mid y$. Taking norms we conclude that $2 \mid y^2$ and so $2 \mid y$. This, however, implies that $x^2 \equiv 2 \pmod 4$, which is impossible.

Finally let us solve the equation

$$x + \sqrt{-2} = (u + v\sqrt{-2})^3 = u^3 + 3u^2 v\sqrt{-2} + 3uv^2(-2) + v^3(-2)\sqrt{-2}.$$

Separation of real and imaginary parts yields $x = u(u^2 - 6v^2)$ and $1 = v(3u^2 - 2v^2)$. The second equation implies that $v = 1$ and $u = \pm 1$. This gives $x = \pm 5$ and $y = 3$ as the only solutions of Equation 12.52.
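The last step can be replayed mechanically. The sketch below is an illustration under stated assumptions: the function name `cube_solutions` and the search bound are hypothetical, and $y$ is recovered from $y = N(u + v\sqrt{-2}) = u^2 + 2v^2$, which follows from taking norms in $x + \sqrt{-2} = (u + v\sqrt{-2})^3$.

```python
def cube_solutions(bound: int):
    """Search |u|, |v| <= bound for solutions of 1 = v(3u^2 - 2v^2).

    Each hit yields x = u(u^2 - 6v^2) and y = u^2 + 2v^2.
    """
    found = []
    for u in range(-bound, bound + 1):
        for v in range(-bound, bound + 1):
            if v * (3 * u * u - 2 * v * v) == 1:
                x = u * (u * u - 6 * v * v)
                y = u * u + 2 * v * v
                found.append((u, v, x, y))
    return found


for u, v, x, y in cube_solutions(20):
    # Each reported pair satisfies Equation 12.52.
    print(f"u={u}, v={v}: x={x}, y={y}, x^2+2={x * x + 2}, y^3={y ** 3}")
```

Only $v = 1$, $u = \pm 1$ survive, giving $x = \mp 5$ and $y = 3$.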

Of course, it is still necessary to prove that $\mathbb{Z}[\sqrt{-2}]$ does possess the unique prime factorization property, and we hasten to do so. Observe that the elements of $\mathbb{Z}[\sqrt{-2}]$ form a rectangular lattice in which each cell is a rectangle of dimensions $1 \times \sqrt{2}$ (see Figure 12.6).

Lemma 12.53 The restriction to $\mathbb{Z}[\sqrt{-2}]$ of the function

$$N(a + b\sqrt{-2}) = a^2 + 2b^2 \quad \text{for all } a, b \in \mathbb{R}$$

is multiplicative.

Proof. See Exercise 12.5.17. ∎


The proof that $\mathbb{Z}[\sqrt{-2}]$ has a division algorithm is very similar to that of the Gaussian integers.

Lemma 12.54 (Division Algorithm) Let $z$ and $w$ be two nonzero Eulerian integers such that $N(z) \geq N(w)$. Then there exist two Eulerian integers $q$ and $r$ such that

$$z = qw + r \quad \text{and} \quad N(r) < N(w). \qquad (12.55)$$

Moreover, $(z, w) = (w, r)$.


Figure 12.6 The Eulerian integers $\mathbb{Z}[\sqrt{-2}]$


Proof. Let $z = a + b\sqrt{-2}$ and $w = c + d\sqrt{-2}$. Then

$$\frac{z}{w} = \frac{a + b\sqrt{-2}}{c + d\sqrt{-2}} = s + t\sqrt{-2}$$

where

$$s = \frac{ac + 2bd}{c^2 + 2d^2} \quad \text{and} \quad t = \frac{bc - ad}{c^2 + 2d^2}.$$

Let $s'$ and $t'$ be rational integers that are closest to $s$ and $t$, respectively. Then clearly

$$|s - s'| \leq \tfrac{1}{2} \quad \text{and} \quad |t - t'| \leq \tfrac{1}{2}.$$

Thus, we could think of $q = s' + t'\sqrt{-2}$ as the Eulerian integer that is closest to $z/w$. Set

$$\rho = \frac{z}{w} - q \qquad (12.56)$$

and note that

$$N(\rho) = (s - s')^2 + 2(t - t')^2 \leq \left(\tfrac{1}{2}\right)^2 + 2\left(\tfrac{1}{2}\right)^2 = \tfrac{3}{4} < 1.$$

By Equation 12.56, $z = qw + \rho w$ where $z$, $w$, and $q$ are all Eulerian integers, so that $\rho w$ is also an Eulerian integer for which

$$N(\rho w) = N(\rho) N(w) < 1 \cdot N(w) = N(w).$$

So if we set $r = \rho w$, we are done with Equation 12.55.

As for this lemma's last equation, it follows from $z = qw + r$ that every common factor of $z$ and $w$ is also a common factor of $w$ and $r$, and, vice versa, every common factor of $w$ and $r$ is also a common factor of $z$ and $w$. The desired equality follows immediately. ∎


For example, to apply the division algorithm to $z = -15 - \sqrt{-2}$ and $w = 2 - 3\sqrt{-2}$, here

$$s + t\sqrt{-2} = (-24/22) - (47/22)\sqrt{-2}$$

and hence

$$q = s' + t'\sqrt{-2} = -1 - 2\sqrt{-2}$$

and

$$r = z - qw = (-15 - \sqrt{-2}) + (1 + 2\sqrt{-2})(2 - 3\sqrt{-2}) = -1.$$

For another example, to find the gcd of $(-115 - 31\sqrt{-2})$ and $(54 - 27\sqrt{-2})$, the first iteration of the Euclidean algorithm is

$$-115 - 31\sqrt{-2} = (-1 - \sqrt{-2})(54 - 27\sqrt{-2}) + (-7 - 4\sqrt{-2})$$

and the second one

$$54 - 27\sqrt{-2} = (-2 + 5\sqrt{-2})(-7 - 4\sqrt{-2}).$$

Hence the required gcd is $-7 - 4\sqrt{-2}$.
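The division algorithm of Lemma 12.54 and the resulting Euclidean algorithm can be sketched in a few lines. This is an illustrative sketch, not the author's code: an Eulerian integer $a + b\sqrt{-2}$ is represented as the pair `(a, b)`, and the names `mul`, `nearest`, `divide`, and `gcd` are assumptions introduced here. Ties in rounding are broken upward, which still keeps $N(r) < N(w)$.

```python
def mul(z, w):
    """Product of two Eulerian integers given as pairs (a, b) = a + b*sqrt(-2)."""
    a, b = z
    c, d = w
    return (a * c - 2 * b * d, a * d + b * c)


def nearest(num: int, den: int) -> int:
    """Integer nearest to num/den for den > 0 (ties rounded up)."""
    return (2 * num + den) // (2 * den)


def divide(z, w):
    """Return (q, r) with z = q*w + r and N(r) < N(w), as in Lemma 12.54."""
    a, b = z
    c, d = w
    n = c * c + 2 * d * d                   # N(w)
    q = (nearest(a * c + 2 * b * d, n),     # s' closest to s
         nearest(b * c - a * d, n))         # t' closest to t
    qw = mul(q, w)
    return q, (a - qw[0], b - qw[1])


def gcd(z, w):
    """Euclidean algorithm: repeat the division step until the remainder is 0."""
    while w != (0, 0):
        _, r = divide(z, w)
        z, w = w, r
    return z


print(divide((-15, -1), (2, -3)))   # quotient and remainder of the first example
print(gcd((-115, -31), (54, -27)))  # gcd of the second example
```

The two calls reproduce the worked examples: quotient $-1 - 2\sqrt{-2}$ with remainder $-1$, and gcd $-7 - 4\sqrt{-2}$. The result of `gcd` is determined only up to a unit factor $\pm 1$.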


Proposition 12.57 $\mathbb{Z}[\sqrt{-2}]$ has unique factorization.

It is known that for every odd prime $p$ of the form $8k + 1$ or $8k + 3$ the equation $p = x^2 + 2y^2$ has a solution. It follows from the above proposition that this solution is unique up to the signs of $x$ and $y$.
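The uniqueness claim is easy to test numerically for small primes. The following sketch is only a finite check, not a proof; the names `is_prime` and `representations` and the bound 500 are assumptions chosen for illustration.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial-division primality test."""
    if n < 2:
        return False
    return all(n % k for k in range(2, isqrt(n) + 1))


def representations(p: int):
    """All pairs (x, y) of nonnegative integers with p = x^2 + 2*y^2."""
    reps = []
    for y in range(isqrt(p // 2) + 1):
        rest = p - 2 * y * y
        x = isqrt(rest)
        if x * x == rest:
            reps.append((x, y))
    return reps


for p in range(3, 60):
    if is_prime(p) and p % 8 in (1, 3):
        print(p, representations(p))
```

Each listed prime appears with exactly one pair, for example $11 = 3^2 + 2 \cdot 1^2$ and $17 = 3^2 + 2 \cdot 2^2$.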

The mathematical structure $\mathbb{Z}[\sqrt{-3}]$ is defined in a manner similar to the Gaussian integers $\mathbb{Z}[\sqrt{-1}]$ and the Eulerian integers $\mathbb{Z}[\sqrt{-2}]$. It is tempting to speculate that this structure also enjoys unique factorization. However, the argument of Lemma 12.37 fails in this new context. In fact, $\mathbb{Z}[\sqrt{-3}]$ does not have unique factorization, as is made clear by Table 12.5 and the factorizations

$$4 = 2 \times 2 = (1 + \sqrt{-3})(1 - \sqrt{-3}).$$

Euler, as well as other mathematicians of the time, was unaware of this fact and, as a result, produced several faulty proofs, including one that purported to show that the equation $x^3 + y^3 = z^3$ has no nonzero integer solutions. The structure $\mathbb{Z}[\sqrt{-3}]$ can be visualized just as $\mathbb{Z}[\sqrt{-2}]$ was, with the sole difference being that the tiling rectangle has height $\sqrt{3}$ rather than $\sqrt{2}$.


Exercises 12.5 


1. Find the number of all the Eulerian integers with norm less than 5, 6, 7, 8 or 9. 


2. Let $f(n)$ denote the number of Eulerian integers of norm less than $n$. Is the number of values $n$ for which $f(n) = f(n + 1)$ finite or infinite?

3. Report on an Eulerian version of Gauss's circle problem.
4. Solve the Diophantine equation x* +2 = y*. 
5- Solve the Diophantine equation x* +2 = y?. 
6. Factor 100 into Eulerian primes.
7. Factor $-8 + 7\sqrt{-2}$ into Eulerian primes.
8. Factor $-22 - \sqrt{-2}$ into Eulerian primes.
9. Factor $-26 + 43\sqrt{-2}$ into Eulerian primes.
10. Factor $-394 - 526\sqrt{-2}$ into Eulerian primes.
11. Apply the division algorithm of Lemma 12.54 to $z = 5 + 3\sqrt{-2}$ and $w = 2 + \sqrt{-2}$.

12. Apply the division algorithm of Lemma 12.54 to $z = 25 - 5\sqrt{-2}$ and $w = 2 + \sqrt{-2}$.


Table 12.4 Some prime factorizations in $\mathbb{Z}[\sqrt{-2}]$


    norm   prime                norm   prime
    3      $\sqrt{-3}$          25     $5$
    4      $2$                  31     $2 + 3\sqrt{-3}$
    4      $1 + \sqrt{-3}$      37     $5 + 2\sqrt{-3}$
    7      $2 + \sqrt{-3}$      43     $4 + 3\sqrt{-3}$
    13     $1 + 2\sqrt{-3}$     61     $7 + 2\sqrt{-3}$
    19     $4 + \sqrt{-3}$

Table 12.5 Some prime numbers of $\mathbb{Z}[\sqrt{-3}]$

13. Use the Euclidean algorithm to find the gcd of the following pairs of Eulerian integers:
    (a) $(11\sqrt{-2},\ 5 - 2\sqrt{-2})$
    (b) $(8 + 29\sqrt{-2},\ 14 + 26\sqrt{-2})$
    (c) $(45 + 19\sqrt{-2},\ -44 + 37\sqrt{-2})$
    (d) $(32 + 19\sqrt{-2},\ 34 + 14\sqrt{-2})$

14. Let $g(n)$ denote the number of Eulerian primes of norm less than $n$. Is the number of values $n$ for which $g(n) = g(n + 1)$ finite or infinite?

15. Since $\mathbb{Z}[\sqrt{-3}]$ does not have unique factorization, the procedure used in the proof of Lemma 12.37, applied to this number system, must fail to produce a Euclidean algorithm. Explain where the failure occurs.

16. Prove that if

$$\omega = \frac{-1 + \sqrt{-3}}{2}$$

then $\mathbb{Z}[\omega]$ has unique factorization. (Hint: show that $\omega^2 = -1 - \omega$.)

17. Prove Lemma 12.53.


12.6 What Is the Essence of Primality? 


In his groundbreaking article of 1847 On the Theory of Complex Numbers, E. E. Kummer 


(1810-1893) wrote: 


I have observed, however, that even though $f(\alpha)$ cannot in any way be broken up into complex factors, it still does not possess the true nature of a complex prime number, for, quite commonly, it lacks the first and most important property of prime numbers; namely, that the product of two prime numbers is divisible by no other prime numbers.

    norm   divisors
    4      $2$
    5      $\sqrt{-5}$
    6      $1 + \sqrt{-5}$
    9      $2 + \sqrt{-5}$
           $3$
    14     $3 + \sqrt{-5}$
    16     $4 = 2^2$
    20     $2\sqrt{-5}$
    21     $1 + 2\sqrt{-5}$
           $4 + \sqrt{-5}$
    24     $2 + 2\sqrt{-5} = 2(1 + \sqrt{-5})$
    25     $5 = -(\sqrt{-5})^2$
    29     $3 + 2\sqrt{-5}$
    30     $5 + \sqrt{-5}$
    36     $6 = (1 + \sqrt{-5})(1 - \sqrt{-5})$
           $6 = 2 \cdot 3$
    41     $6 + \sqrt{-5}$
    45     $3\sqrt{-5}$
           $5 + 2\sqrt{-5} = \sqrt{-5}(2 - \sqrt{-5})$
    46     $1 + 3\sqrt{-5}$
    49     $2 + 3\sqrt{-5}$
           $7$
    54     $3 + 3\sqrt{-5} = 3(1 + \sqrt{-5})$
           $7 + \sqrt{-5} = (2 - \sqrt{-5})(1 + \sqrt{-5})$
    56     $6 + 2\sqrt{-5} = 2(3 + \sqrt{-5})$
    61     $4 + 3\sqrt{-5}$
    64     $8 = 2^3$
    69     $7 + 2\sqrt{-5}$
           $8 + \sqrt{-5}$
    70     $5 + 3\sqrt{-5} = \sqrt{-5}(3 - \sqrt{-5})$
    80     $4\sqrt{-5} = 2^2\sqrt{-5}$
    81     $1 + 4\sqrt{-5} = -(2 - \sqrt{-5})^2$
    84     $2 + 4\sqrt{-5} = 2(1 + 2\sqrt{-5})$
           $8 + 2\sqrt{-5} = 2(4 + \sqrt{-5})$
    86     $9 + \sqrt{-5}$
    89     $3 + 4\sqrt{-5}$
    94     $7 + 3\sqrt{-5}$
    96     $4 + 4\sqrt{-5} = 2^2(1 + \sqrt{-5})$
    100    $10 = -2(\sqrt{-5})^2$

Table 12.6 Some prime factorizations in $\mathbb{Z}[\sqrt{-5}]$


The readers have already encountered this "most important" property as Proposition 12.43, which can be summarized as

$$p \mid ab \implies p \mid a \ \text{ or } \ p \mid b \qquad (12.58)$$

where $a$ and $b$ are arbitrary (rational) integers. Most introductory books on elementary number theory begin with some definition of the integers, go on to prove the Euclidean


Figure 12.7 The Eisenstein Integers


algorithm, and then use this algorithm to prove Lemma 12.37. In Euclid's Book VII, Proposition 30 says

If two numbers by multiplying one another make some number and any prime measure¹ the product, it will also measure one of the original numbers.

This is clearly equivalent to Property 12.58. Some analog of Lemma 4.6 or Proposition 12.43 is then commonly used to deduce the Fundamental Theorem of Arithmetic (Theorem 4.9 and Proposition 6.9). It must be stressed here that, for reasons unknown, Euclid did not state the Fundamental Theorem of Arithmetic explicitly. There is evidence to suggest that we today evaluate the subjective question of "which propositions are important and which are uninteresting" differently from Euclid.

¹ Divide without remainder.

There are, however, other number systems that admit of primes as well. As noted above, Gauss studied the number system bearing his name in depth, going so far as to give a detailed classification of its primes. Just as the Law of Biquadratic (or Quartic) Reciprocity produced the Gaussian integers as a by-product, so did the search for a Law of Cubic Reciprocity motivate the definition of the system

$$\mathbb{Z}[\omega] = \{\, a + b\omega \mid a, b \in \mathbb{Z} \,\}$$

where $\omega$ is the cubic root of unity defined in Figure 12.7. In fact, Kummer greatly generalized these two number systems when he worked on the factorization of the elements of

$$\mathbb{Z}[\zeta] = \{\, a_0 + a_1\zeta + \cdots + a_{n-1}\zeta^{n-1} \mid a_i \in \mathbb{Z} \,\}$$

where $\zeta$ is any $n$-th root of unity. There he discovered a surprising fact: for $n = 23$ the number system $\mathbb{Z}[\zeta]$ does not have the desirable quality of unique factorization. The demonstration is rather technical and would take us far afield. Fortunately a much simpler example is available. In $\mathbb{Z}[\sqrt{-3}]$ we have the following distinct prime factorizations of 4:

$$2 \cdot 2 = (1 + \sqrt{-3})(1 - \sqrt{-3}).$$

This equation is disturbing because it implies that $2 \mid (1 + \sqrt{-3})(1 - \sqrt{-3})$ even though $2 \nmid 1 \pm \sqrt{-3}$, which runs counter to Property 12.58.

One way to describe what Kummer did is to say that when he realized that the prime numbers were "misbehaved" he replaced them with "new and improved" versions. To be precise, he changed the definition of primes to the following: A number $p$ is prime if whenever it divides a product $ab$ it must divide either $a$ or $b$. That the rational primes of $\mathbb{Z}$ have this property follows immediately from Lemma 4.6. The converse is also true and its proof is relegated to Exercise 12.6.1.

For the remainder of this book we shall refer to the traditional primes that have nondivisibility as their defining characteristic as indecomposable integers. For example in $\mathbb{Z}[\sqrt{-3}]$, where

$$2 \cdot 2 = (1 - \sqrt{-3})(1 + \sqrt{-3}),$$

each of the factors is an indecomposable nonprime number. They are indecomposable because of Table 12.5 (where they are listed as primes). To see that they are no longer to be considered as primes note that each has norm 4, but $\mathbb{Z}[\sqrt{-3}]$ has no elements of norm 2.
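The failure of Property 12.58 in $\mathbb{Z}[\sqrt{-3}]$ can be checked by direct computation. The sketch below is illustrative only: elements $a + b\sqrt{-3}$ are stored as pairs `(a, b)`, the helper names are assumptions, and divisibility by a rational integer is tested coordinatewise (the criterion stated later in the text as Proposition 13.4).

```python
D = -3  # we work in Z[sqrt(-3)]


def multiply(z, w):
    """Product of a + b*sqrt(D) and c + e*sqrt(D), given as pairs."""
    a, b = z
    c, e = w
    return (a * c + D * b * e, a * e + b * c)


def int_divides(m: int, z) -> bool:
    """A rational integer m divides a + b*sqrt(D) iff m divides both a and b."""
    return z[0] % m == 0 and z[1] % m == 0


def elements_of_norm(n: int, bound: int = 10):
    """Pairs (a, b) with a^2 + 3*b^2 == n, searched in a small box."""
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if a * a + 3 * b * b == n]


product = multiply((1, 1), (1, -1))
print(product)                   # the pair representing 4
print(int_divides(2, product))   # 2 divides the product
print(int_divides(2, (1, 1)))    # but 2 does not divide 1 + sqrt(-3)
print(elements_of_norm(2))       # no element has norm 2
```

So 2 divides the product without dividing either factor, and the absence of elements of norm 2 is what prevents 2 and $1 \pm \sqrt{-3}$ from splitting further.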

However, unique factorization is a very desirable quality in number systems and Kummer's solution was to enlarge the context by positing the existence of a multitude of ideal numbers in each such system. Technically speaking, these ideal numbers were undefined; only their interactions with the original integers in the system in question are specified. More precisely, Kummer defined what it means to say that a number is divisible by an ideal prime number.² This was fairly standard practice in the nineteenth-century mathematical world and examples of this type of reasoning are common: such was the case in this text with the complex numbers and the Galois fields. Kummer, Galois, and Euler were not concerned with the "real nature" of their creations, just with how to use them in mathematical reasoning.

The second half of the nineteenth century witnessed a shift toward absolute rigor amongst mathematicians. As part of this trend Dedekind suggested that in some contexts, at least, new elements could be represented by sets of old ones. The reader may be familiar with the method of Dedekind cuts, developed by Dedekind, which defines irrational real numbers with certain sets of rational numbers. A similar tack was taken by Dedekind for the interpretation of Kummer's ideal numbers. They should be sets of rational integers. These are the ideals of the next chapter.

In the event, the mathematical establishment declined Kummer’s approach, though 
not his mathematical results, in favor of Dedekind’s approach which is the content of 
the next chapter. These doubts about the legitimacy of ideal integers notwithstanding, 
Kummer’s work is justly lauded for its originality, depth, and influence. His “definition” 


of ideal numbers will be discussed in greater detail in the next chapter. 


Exercises 12.6 


1. Let $q > 1$ be a positive integer of $\mathbb{Z}$ such that for any two integers $a$ and $b$, $q \mid ab$ implies that either $q \mid a$ or $q \mid b$. Show that $q$ is a prime of $\mathbb{Z}$.

2. State and prove the converse of Exercise 12.6.1. Let $n > 1$ and $k > 1$ be positive integers with the following property: For every $a, b \in \mathbb{Z}$, if $n \mid ab$ then either $n \mid a^k$ or $n \mid b^k$. Then there exists a prime integer $p$ such that $n \in \{p, p^2, \ldots, p^k\}$.

3. Prove that the positive integer $n > 1$ is divisible by a square if and only if there exists a number $m$ such that $n \nmid m$ and $n \mid m^2$.


4. State and prove an analog of Exercise 12.6.3 for divisibility by p*. 
Chapter Summary 


The topics of Pythagorean triples, Sums of Squares, Quadratic Reciprocity, Gaussian Integers, and Eulerian Integers are each natural generalizations (or variations) on each other. All have surprising ties to geometry. They also motivate a reexamination of the notions of primeness and irreducibility.

² See Edwards, p. 325.


Chapter Review Exercises 


Mark the following true or false. 


1. There is a positive integer that belongs to infinitely many Pythagorean triples.
2. There is a prime number $p$ that is expressible as the sum of two squares in infinitely many ways.
3. If $p$ is a rational prime, so is $2p$.
2243 \ _. 
a ( 333 )= : 
5. A killion is a number so large that it will kill you.
New Terms 
associates, 295
Eulerian integers, 304
Gaussian integers, 294
Gaussian primes, 294
indecomposable integers, 313
irrational, 294
lattice point, 292
Legendre symbol, 285
method of infinite descent, 283
norm, 295
primitive, 274
Pythagorean triple, 274
quadratic nonresidue, 281
quadratic residue, 281
rational integers, 294
rational primes, 294
units, 294
Supplementary Exercises 
1. What is the sum of an Eulerian and an Eisenstein number? 
2. Is there only one zero or are there several? Could there be an infinite number of 
zeros? 
3. Write a script that will list the primes of $\mathbb{Z}[\sqrt{-6}]$.
4. Explain why the Diophantine equation $x^2 + 61y^2 = 1$ can have only a finite number of solutions, whereas $x^2 - 61y^2 = 1$ has infinitely many solutions.


Chapter 13 


THE ARITHMETIC OF IDEALS 


Since $\mathbb{Z}[\sqrt{d}]$ does not, in general, have the property of unique factorization with respect to prime numbers, we added new primes with respect to which $\mathbb{Z}[\sqrt{d}]$ does have unique factorization.


13.1 Preliminaries 


It would be natural at this point to go on and generalize the number systems of the previous chapter to the general form

$$\mathbb{Z}[\sqrt{d}] = \{\, x + y\sqrt{d} \mid x, y \in \mathbb{Z} \,\}$$

where $d$ is a negative integer which is square-free. However, for pedagogical reasons, it is better to restrict attention to the systems in which $d \equiv 2, 3 \pmod 4$ and $d$ is square-free. The first seven such values of $d$ are $-1$, $-2$, $-5$, $-6$, $-10$, $-13$, and $-14$. For these values of $d$ let

$$\mathbb{Z}[\sqrt{d}] = \{\, x + y\sqrt{d} \mid x, y \in \mathbb{Z} \,\}.$$

For example, when $d = -1$, $\mathbb{Z}[\sqrt{-1}]$ consists of all the Gaussian integers. The elements of $\mathbb{Z}[\sqrt{-2}]$ are, of course, the Eulerian integers, e.g., $1 - 2\sqrt{-2}$ and $3 + 5\sqrt{-2}$. It is clear that for these $d$'s $\mathbb{Z}[\sqrt{d}] \subset \mathbb{C}$, that $\mathbb{Z}[\sqrt{d}] \not\subset \mathbb{R}$, and that $\mathbb{Z}[\sqrt{d}]$ is a subset of $\mathbb{C}$ that is closed with respect to the operations of addition, subtraction, and multiplication. The reason for excluding the case $d \equiv 0 \pmod 4$ is that these numbers are divisible by 4 and hence are not square-free. The reason for the exclusion of numbers that are not square-free is that when $d$ is not square-free, say $d = ap^2$, then

$$x + y\sqrt{d} = x + py\sqrt{a} \in \mathbb{Z}[\sqrt{a}]$$

so that $\mathbb{Z}[\sqrt{d}] \subset \mathbb{Z}[\sqrt{a}]$. Finally, the restriction $d \not\equiv 1 \pmod 4$ is there to simplify the arguments of the main theorems. Recall that if $\alpha = x + y\sqrt{d}$ is any element of $\mathbb{Z}[\sqrt{d}]$, then the conjugate of $\alpha$ is $\bar{\alpha} = x - y\sqrt{d}$.


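Proposition 13.1 Let $\alpha, \beta \in \mathbb{Z}[\sqrt{d}]$. Then
(a) $\overline{\alpha + \beta} = \bar{\alpha} + \bar{\beta}$
(b) $\overline{\alpha\beta} = \bar{\alpha}\bar{\beta}$
(c) $\bar{\bar{\alpha}} = \alpha$
(d) $\bar{\alpha} = \alpha$ if and only if $\alpha \in \mathbb{Z}$.

If $\alpha \in \mathbb{Z}[\sqrt{d}]$ then the norm of $\alpha$ is defined as $N(\alpha) = \alpha\bar{\alpha}$ and the trace of $\alpha$ is defined as $\mathrm{Tr}(\alpha) = \alpha + \bar{\alpha}$. For example, if $\alpha = -2 - 5\sqrt{-3}$, then

$$N(\alpha) = (-2 - 5\sqrt{-3})(-2 + 5\sqrt{-3}) = (-2)^2 - 5^2(\sqrt{-3})^2 = 4 + 75 = 79$$

and

$$\mathrm{Tr}(\alpha) = (-2 - 5\sqrt{-3}) + (-2 + 5\sqrt{-3}) = 2(-2) = -4.$$

Because $d$ is negative, it is clear that if $\alpha = x + y\sqrt{d}$, then

$$N(\alpha) = x^2 - dy^2 \geq 0$$

and $N(\alpha)$ is a rational integer.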
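These definitions are simple enough to compute with directly. The sketch below is an illustration, not part of the text: an element $x + y\sqrt{d}$ is stored as the pair `(x, y)`, and the function names are assumptions chosen here.

```python
def conjugate(z):
    """Conjugate of x + y*sqrt(d): negate the coefficient of sqrt(d)."""
    x, y = z
    return (x, -y)


def norm(z, d: int) -> int:
    """N(x + y*sqrt(d)) = x^2 - d*y^2."""
    x, y = z
    return x * x - d * y * y


def trace(z) -> int:
    """Tr(x + y*sqrt(d)) = 2x."""
    return 2 * z[0]


def multiply(z, w, d: int):
    """Product of two elements of Z[sqrt(d)] given as pairs."""
    a, b = z
    c, e = w
    return (a * c + d * b * e, a * e + b * c)


alpha = (-2, -5)                         # -2 - 5*sqrt(-3), the example above
print(norm(alpha, -3), trace(alpha))     # prints 79 -4
```

Multiplying `alpha` by `conjugate(alpha)` with `multiply` returns the pair `(79, 0)`, matching the definition $N(\alpha) = \alpha\bar{\alpha}$.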


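Proposition 13.2 If $\alpha, \beta \in \mathbb{Z}[\sqrt{d}]$, then $\mathrm{Tr}(\alpha + \beta) = \mathrm{Tr}(\alpha) + \mathrm{Tr}(\beta)$ and $N(\alpha\beta) = N(\alpha)N(\beta)$.

Proof. See Exercise 13.1.1. ∎

The set $\mathbb{Q}[\sqrt{d}] = \{\, x + y\sqrt{d} \mid x, y \in \mathbb{Q} \,\}$ is called a quadratic field. Fortunately, every quadratic field is indeed a field (see Exercise 13.1.3).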


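Proposition 13.3 Suppose $\kappa = \kappa_1 + \kappa_2\sqrt{d}$ where $\kappa_1, \kappa_2 \in \mathbb{Q}$ and $\mathrm{Tr}(\kappa), N(\kappa) \in \mathbb{Z}$. Then $\kappa_1, \kappa_2 \in \mathbb{Z}$.

Proof. Let $\kappa$ satisfy the hypotheses of this proposition. Since $\mathrm{Tr}(\kappa) = 2\kappa_1$, $\kappa_1$ is either a rational integer or half an odd rational integer.

In the first case, the fact that $N(\kappa) = \kappa_1^2 - d\kappa_2^2 \in \mathbb{Z}$ implies that $d\kappa_2^2$ is also an integer. If $m/n$ is the reduced form of $\kappa_2$, then for some $z \in \mathbb{Z}$

$$\frac{dm^2}{n^2} = z \quad \text{or} \quad dm^2 = zn^2,$$

which, unless $n = \pm 1$, contradicts the fact that $d$ is square-free. Hence $\kappa_2$ is a rational integer. Thus, $\kappa_1$ and $\kappa_2$ are both in $\mathbb{Z}$.

In the second case, where $\kappa_1$ is half of an odd rational integer, set $\kappa_1 = a/2$ where $a$ is an odd rational integer. Then, the norm of $\kappa$ is

$$\kappa_1^2 - d\kappa_2^2 = \frac{a^2}{4} - d\kappa_2^2 \in \mathbb{Z}$$

and hence

$$a^2 - d(2\kappa_2)^2 \in 4\mathbb{Z}.$$

By the square-free argument of the first case, $b = 2\kappa_2$ is a rational integer. Since $a^2 \equiv 1 \pmod 4$ and $d \equiv 2, 3 \pmod 4$, the quantity $a^2 - db^2$ is congruent to 1, 2, or 3 modulo 4, which contradicts its membership in $4\mathbb{Z}$. ∎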


It is easy to see that $\mathbb{Z}[\sqrt{d}]$ is closed with respect to addition and subtraction. It calls for a little work to verify closure with respect to multiplication (see Exercise 13.1.2). As expected, the integers are not closed with respect to division. However, we do have the following useful fact regarding divisibility by rational integers.

Proposition 13.4 If $m \in \mathbb{Z}$ and $\alpha = a + b\sqrt{d}$, then $m \mid \alpha$ in $\mathbb{Z}[\sqrt{d}]$ if and only if $m \mid a$ and $m \mid b$ in $\mathbb{Z}$.

Proof. If $m \mid a$ and $m \mid b$ in $\mathbb{Z}$, then clearly $m \mid (a + b\sqrt{d})$. Conversely, suppose $m \mid (a + b\sqrt{d})$, meaning that there exists an integer $\alpha' = a' + b'\sqrt{d}$ such that

$$a + b\sqrt{d} = m(a' + b'\sqrt{d}) = ma' + mb'\sqrt{d}.$$

Since $\sqrt{d}$ is a complex irrational number, it follows that $a = ma'$ and $b = mb'$. ∎


Exercises 13.1 


1. Prove Proposition 13.2. 


2. Prove that $\mathbb{Z}[\sqrt{d}]$ is closed with respect to the operations of addition, subtraction, and multiplication.

3. Prove that the quadratic field $\mathbb{Q}[\sqrt{d}]$ is a field.


13.2 Integers of a Quadratic Field 


In this section, a theory of factorization of integers is developed that is analogous to that which we know to hold for the rational integers. This "natural" approach is then demonstrated to lead to undesirable consequences. In the previous chapter units were defined for $\mathbb{Z}$ and $\mathbb{Z}[i]$; they are now extended to quadratic domains in general.

A unit of $\mathbb{Z}[\sqrt{d}]$ (resp. $\mathbb{Z}$) is an element whose inverse is also in $\mathbb{Z}[\sqrt{d}]$ (resp. $\mathbb{Z}$).

Proposition 13.5 An integer is a unit if and only if its norm is 1.

Proof. It is clear that the units of $\mathbb{Z}$ are $\pm 1$ and hence the proposition holds for the units of the rational integers.

Turning to $\mathbb{Z}[\sqrt{d}]$, let $u$ and $v$ both be units of $\mathbb{Z}[\sqrt{d}]$ such that $uv = 1$. The multiplicative property of the norm then yields $N(u)N(v) = 1$. Since the norm is a positive rational integer, it follows that $N(u) = 1$. If $d = -1$, then $u$ must be one of the four numbers $\pm 1, \pm i$. If $d < -1$, then $N(x + y\sqrt{d}) = x^2 - dy^2$, which can only be 1 if $x = \pm 1$ and $y = 0$.

Conversely, $\pm 1$ are clearly units of $\mathbb{Z}[\sqrt{d}]$, $d = -1, -2, -3, \ldots$, and $\pm i$ are additional units of $\mathbb{Z}[\sqrt{-1}]$. ∎


Corollary 13.6 The units of $\mathbb{Z}[\sqrt{-1}]$ are $\pm 1$ and $\pm i$, whereas those of all other $\mathbb{Z}[\sqrt{d}]$ with negative $d$ are $\pm 1$.

Let $\alpha$ and $\beta$ be two integers such that for some unit $u$, $\alpha = \beta u$. Then $\alpha$ and $\beta$ are said to be associates of each other. It follows from Corollary 13.6 that $a + bi$, $-b + ai$, $-a - bi$, and $b - ai$ are associates in $\mathbb{Z}[i]$, and $\alpha$, $-\alpha$ are each other's associates in $\mathbb{Z}[\sqrt{d}]$ for any $d = -1, -2, -3, \ldots$.

An element $\alpha$ of $\mathbb{Z}[\sqrt{d}]$ is irreducible if for any factorization $\alpha = \beta\gamma$ at least one of $\beta$ and $\gamma$ is necessarily a unit. The irreducible integers of $\mathbb{Z}$ are the associates of the classical primes, and those of $\mathbb{Z}[i]$ are the Gaussian primes.


Theorem 13.7 If $\alpha \in \mathbb{Z}[\sqrt{d}]$ has a norm which is prime in $\mathbb{Z}$, then $\alpha$ is irreducible in $\mathbb{Z}[\sqrt{d}]$.

Proof. Let $\beta$ and $\gamma$ be integers in $\mathbb{Z}[\sqrt{d}]$ such that $\alpha = \beta\gamma$. By Lemma 12.53

$$N(\alpha) = N(\beta)N(\gamma).$$

Since the norm of $\alpha$ is a prime in $\mathbb{Z}$, it follows that either $N(\beta)$ or $N(\gamma)$ equals 1 and so one of them is a unit. Hence $\alpha$ is irreducible. ∎

For example, the integer $3 + 2\sqrt{-5}$ is irreducible in $\mathbb{Z}[\sqrt{-5}]$ because its norm is

$$3^2 - 2^2(-5) = 29.$$


The converse to Theorem 13.7 does not hold, as is demonstrated by the following examples:

In the Gaussian integers $\mathbb{Z}[\sqrt{-1}]$, $N(3) = 9$. Hence, if $3 = \alpha\beta$ is any factorization of 3 into nonunit Gaussian integers, both $\alpha$ and $\beta$ must have norm 3. Since there are no such integers in $\mathbb{Z}[\sqrt{-1}]$, 3 is an irreducible integer whose norm is not prime in $\mathbb{Z}$.

In the ring¹ $\mathbb{Z}[\sqrt{-6}]$ the integer 2 is irreducible because the only integers whose norm properly divides $N(2) = 4$ are units. The same goes for 5.

In the ring $\mathbb{Z}[\sqrt{-6}]$, the integer $\alpha = 2 + \sqrt{-6}$ is irreducible. For, if $\alpha = \beta\gamma$, then $N(\beta) \mid N(\alpha) = 10$. However, since there are no nonunits of $\mathbb{Z}[\sqrt{-6}]$ whose norms equal either 2 or 5, it follows that either $\beta$ or $\gamma$ is a unit. Hence $\alpha$ is irreducible in $\mathbb{Z}[\sqrt{-6}]$. The same, of course, holds for $2 - \sqrt{-6}$.

Theorem 13.8 Every nonzero nonunit in $\mathbb{Z}[\sqrt{d}]$ is a product of irreducibles in $\mathbb{Z}[\sqrt{d}]$.

Proof. By induction on $N(\alpha)$. (See Theorem 4.9.) ∎

For example, the integer 10 has two factorizations in $\mathbb{Z}[\sqrt{-6}]$:

$$2 \cdot 5 = 10 = (2 + \sqrt{-6})(2 - \sqrt{-6}).$$

Above, it was demonstrated that the integers 2, 5, and $2 \pm \sqrt{-6}$ are all irreducible in $\mathbb{Z}[\sqrt{-6}]$. Hence 10 has distinct factorizations into irreducible integers.
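The norm arguments above amount to a finite search, which can be automated for negative $d$. The following brute-force sketch is an assumption-laden illustration (pairs `(x, y)` for $x + y\sqrt{d}$, hypothetical function names): an element is declared irreducible when no element of norm strictly between 1 and $N(\alpha)$ divides it, where $\beta \mid \alpha$ is tested by checking that $\alpha\bar{\beta}/N(\beta)$ has integer coordinates.

```python
from math import isqrt


def norm(z, d: int) -> int:
    x, y = z
    return x * x - d * y * y


def elements_of_norm(n: int, d: int):
    """All (x, y) with x^2 - d*y^2 == n, for negative d and n >= 0."""
    out = []
    ymax = isqrt(n // -d)
    for y in range(-ymax, ymax + 1):
        rest = n + d * y * y          # this is x^2, nonnegative by the choice of ymax
        x = isqrt(rest)
        if x * x == rest:
            out.extend({(x, y), (-x, y)})
    return out


def divides(beta, alpha, d: int) -> bool:
    """beta | alpha iff alpha * conj(beta) is divisible by N(beta) coordinatewise."""
    a, b = alpha
    c, e = beta
    n = norm(beta, d)
    return (a * c - d * b * e) % n == 0 and (b * c - a * e) % n == 0


def is_irreducible(alpha, d: int) -> bool:
    n = norm(alpha, d)
    if n <= 1:
        return False                  # zero and units are excluded
    for m in range(2, n):
        if n % m == 0:
            if any(divides(beta, alpha, d) for beta in elements_of_norm(m, d)):
                return False
    return True


for z in [(2, 0), (5, 0), (2, 1), (2, -1), (10, 0)]:
    print(z, is_irreducible(z, -6))
```

Under these assumptions the four factors 2, 5, and $2 \pm \sqrt{-6}$ are reported irreducible in $\mathbb{Z}[\sqrt{-6}]$, while 10 is not.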

In 1847 the French mathematician Lamé famously announced that he had proven Fermat's "Last Theorem." His proof turned out to make the hidden assumption that if a perfect square is split into the product of two relatively prime integers, then each of the factors must itself be a square. This is easily enough proved to hold for the rational integers, but, as we now show, is false in general.

Consider the equation

$$2 \cdot (-3) = (\sqrt{-6})^2.$$

Note that the right-hand side of this equation is a perfect square (i.e., the square of an integer). The two rational integers 2 and $-3$ are relatively prime because

$$1 = (-1)(2) + (-1)(-3)$$

and hence their common factors are necessarily units. On the other hand, neither 2 nor $-3$ is a square, nor are their associates. The reason for that is that the norm of any such "square root" must be either 2 or 3 and $\mathbb{Z}[\sqrt{-6}]$ contains no such integers.

¹ At this point a ring is an algebraic structure with two binary operations resembling addition and multiplication. A more formal definition appears in the next chapter. Some of the better-known rings are $\mathbb{Z}$, $\mathbb{Z}_n$, $\mathbb{Z}[i]$, $\mathbb{Z}[\sqrt{-6}]$, and $\mathbb{Z}[\sqrt{d}]$, as well as the rational, the real, and the complex numbers and the Galois fields of Chapter 7.


Exercises 13.2 


1. Find all the irreducible integers of $\mathbb{Z}[\sqrt{-5}]$ of norm less than 25.
2. Find all the irreducible integers of $\mathbb{Z}[\sqrt{-6}]$ of norm less than 25.
3. Factor all the integers of $\mathbb{Z}[\sqrt{-5}]$ of norm less than 30 into irreducible integers.
4. Factor all the integers of $\mathbb{Z}[\sqrt{-6}]$ of norm less than 30 into irreducible integers.


13.3 Ideals 


By now the reader has seen enough examples to allow for the possibility that unique factorization might be the exception rather than the rule. In order to remedy this situation we introduce new entities called, for historical reasons, ideals. The four arithmetic operations are extended to these ideals, and eventually we demonstrate that these do have unique factorization into primes.

Let $A = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ be a finite set of elements of $\mathbb{Z}[\sqrt{d}]$. Then the (possibly infinite) set

$$(A) = (\alpha_1, \alpha_2, \ldots, \alpha_n) = \{\, \eta_1\alpha_1 + \eta_2\alpha_2 + \cdots + \eta_n\alpha_n \mid \eta_i \in \mathbb{Z}[\sqrt{d}] \,\}$$

is called the ideal of $\mathbb{Z}[\sqrt{d}]$ generated by $A$.

For example, if $A = \{0\}$, then $(A) = \{0\}$ for all $d$. If $A = \{1\}$, then $(A) = \mathbb{Z}[\sqrt{d}]$ for all $d$. If $A = \{2\}$, then

$$(A) = \{\, a + b\sqrt{d} \mid a \text{ and } b \text{ are both even} \,\}.$$

If $A = \{2, 4\}$, then $(A) = (2)$. If $A = \{\sqrt{-5}, 3\}$, then $(A)$ contains the following integers of $\mathbb{Z}[\sqrt{-5}]$:

$$0,\ 3,\ \sqrt{-5},\ 3 - \sqrt{-5},\ 4 + 3\sqrt{-5},\ -15 + 4\sqrt{-5}$$

as well as

$$(1 + \sqrt{-5})3 + (2 - \sqrt{-5})\sqrt{-5} = 8 + 5\sqrt{-5}$$

and

$$(8 + 5\sqrt{-5})(1 - \sqrt{-5}) = 33 - 3\sqrt{-5}.$$

If $A = \{3, 2\}$, then $(A) = (1)$. The reason for this is that $1 = 1 \cdot 3 + (-1) \cdot 2 \in (A)$.


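Proposition 13.9 Suppose $\alpha$ and $\beta$ are elements of the ideal $\mathfrak{a}$ of $\mathbb{Z}[\sqrt{d}]$. Then $\alpha + \beta \in \mathfrak{a}$ and $r\alpha \in \mathfrak{a}$ for every $r \in \mathbb{Z}[\sqrt{d}]$.

Proof. By definition there exists a set $A = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}$ such that $\mathfrak{a} = (A)$, and for each $i = 1, 2, \ldots, m$ there exist integers $r_i, s_i$ such that
$$\alpha = \sum_{i=1}^{m} r_i\alpha_i \quad\text{and}\quad \beta = \sum_{i=1}^{m} s_i\alpha_i.$$
Then
$$\alpha + \beta = \sum_{i=1}^{m} r_i\alpha_i + \sum_{i=1}^{m} s_i\alpha_i = \sum_{i=1}^{m} (r_i + s_i)\alpha_i \in \mathfrak{a}.$$
Moreover
$$r\alpha = r\sum_{i=1}^{m} r_i\alpha_i = \sum_{i=1}^{m} (r r_i)\alpha_i \in \mathfrak{a}. \qquad ∎$$

Proposition 13.10 If $u$ is a unit of $\mathbb{Z}[\sqrt{d}]$, then $(u) = \mathbb{Z}[\sqrt{d}]$.

Proof. Let $\mathfrak{a} = (u)$. Since $u$ is a unit there exists an integer $v$ such that $uv = 1$. By Proposition 13.9, since $u \in \mathfrak{a}$, so is $uv = 1$. Then, by the same proposition, for any $w$ in $\mathbb{Z}[\sqrt{d}]$, $w = w\cdot 1 \in \mathfrak{a}$. ∎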


If $A$ is a singleton, say $A = \{\alpha\}$, then $(A) = (\alpha)$ is said to be a principal ideal. Clearly $(2)$ and $(102)$ are principal ideals of $\mathbb{Z}$ and $(2 - 237\sqrt{-6})$ is a principal ideal of $\mathbb{Z}[\sqrt{-6}]$. We give an example of an ideal which is not principal: Consider the ideal $I = (2, \sqrt{-6})$ of $\mathbb{Z}[\sqrt{-6}]$ and suppose that there is an integer $\alpha$ of $\mathbb{Z}[\sqrt{-6}]$ such that $I = (\alpha)$. We first note that the norm of every element of $I$ is even. This is justified as follows. By definition, every integer in $I$ can be expressed in the form
$$(a + b\sqrt{-6})\cdot 2 + (a' + b'\sqrt{-6})\cdot\sqrt{-6} = (2a - 6b') + (2b + a')\sqrt{-6}.$$
This integer has norm
$$(2a - 6b')^2 + 6(2b + a')^2,$$
which is clearly even.

Since $2 \in I = (\alpha)$, it follows that there is an integer $\beta \in \mathbb{Z}[\sqrt{-6}]$ such that $2 = \alpha\beta$. Since the norm is multiplicative,
$$4 = N(2) = N(\alpha)N(\beta)$$
and we may conclude that $N(\alpha) = 1$, 2, or 4. If $N(\alpha) = 4$, then $N(\beta) = 1$, so $\beta$ is a unit and $I = (2)$; this is impossible because $\sqrt{-6} \in I$ is not a multiple of 2. Moreover, no integer in $\mathbb{Z}[\sqrt{-6}]$ has norm 2 (see Exercise 13.2.2), so that necessarily $N(\alpha) = 1$. It follows that $\alpha$ is a unit and hence
$$I = (\alpha) = (1).$$
This means that $1 \in I$, contradicting the fact that all the elements of $I$ have even norm. Thus, the ideal $I = (2, \sqrt{-6})$ is not principal.
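The two numerical facts this argument rests on can be spot-checked by brute force. The sketch below is only an illustration over a small search window, not a proof; the helper names `norm` and `element_of_I` are hypothetical, and elements $a + b\sqrt{-6}$ are stored as pairs `(a, b)`.

```python
def norm(a, b, d=-6):
    """Norm of a + b*sqrt(d), i.e. a^2 - d*b^2."""
    return a * a - d * b * b

# a^2 + 6*b^2 = 2 would force |a| <= 1 and b = 0, so a small window suffices.
norm_two = [(a, b) for a in range(-2, 3) for b in range(-2, 3) if norm(a, b) == 2]

def element_of_I(a, b, a2, b2):
    """(a + b*sqrt(-6))*2 + (a2 + b2*sqrt(-6))*sqrt(-6), returned as a pair."""
    return (2 * a - 6 * b2, 2 * b + a2)

# Sample many elements of I = (2, sqrt(-6)) and test the parity of the norm.
R = range(-3, 4)
all_even = all(norm(*element_of_I(a, b, a2, b2)) % 2 == 0
               for a in R for b in R for a2 in R for b2 in R)

print(norm_two, all_even)
```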


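Proposition 13.11 Let $\mathfrak{a} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ and $\mathfrak{b} = (\beta_1, \beta_2, \ldots, \beta_n)$ be two ideals in $\mathbb{Z}[\sqrt{d}]$. Then the following are equivalent:

(a) $\mathfrak{a} \subseteq \mathfrak{b}$;

(b) each $\alpha_i$ is in $\mathfrak{b}$;

(c) each $\alpha_i$ is a $\mathbb{Z}[\sqrt{d}]$-linear combination of the $\beta_j$'s.

Proof. That (a) implies (b) is clear, and that (b) implies (c) follows from the definition of ideals. For (c) $\Rightarrow$ (a), let $\alpha$ be any integer of $\mathfrak{a}$. This means that $\alpha$ is a $\mathbb{Z}[\sqrt{d}]$-linear combination of the $\alpha_i$'s. By hypothesis, each $\alpha_i$ is a $\mathbb{Z}[\sqrt{d}]$-linear combination of the $\beta_j$'s. It follows that $\alpha$ is a $\mathbb{Z}[\sqrt{d}]$-linear combination of the $\beta_j$'s. Hence, $\alpha \in \mathfrak{b}$ as well. ∎

Corollary 13.12 Suppose $A = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}$ and $B = \{\beta_1, \beta_2, \ldots, \beta_n\}$. Then $(A) = (B)$ if and only if $A \subseteq (B)$ and $B \subseteq (A)$.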


The next corollary states that the addition of a multiple of one generator to another does not affect the span.

Corollary 13.13 If $\alpha_1, \alpha_2, \ldots, \alpha_m, \gamma \in \mathbb{Z}[\sqrt{d}]$, then
$$(\alpha_1, \alpha_2, \ldots, \alpha_m) = (\alpha_1 + \gamma\alpha_2, \alpha_2, \ldots, \alpha_m).$$

For example, to simplify
$$(14+9\sqrt{-6},\; 16-3\sqrt{-6},\; 2+6\sqrt{-6},\; 10),$$
add twice the second generator to the third one to obtain
$$(14+9\sqrt{-6},\; 16-3\sqrt{-6},\; 34,\; 10).$$
Next add three times the second generator to the first one, resulting in
$$(62,\; 16-3\sqrt{-6},\; 34,\; 10),$$
which has the same span as $(2, 3\sqrt{-6})$.


Proposition 13.14 Every ideal of $\mathbb{Z}$ is principal.

Proof. If $I = (0)$ there is nothing to prove. Otherwise, let $d$ be the least positive element of the ideal $I$ of $\mathbb{Z}$. If $a$ is any integer of $I$, there exist an integer $q$ and a nonnegative integer $r < d$ such that $a = qd + r$. Since $a, d \in I$, it follows that $r = a - qd \in I$. The minimality of $d$ implies that $r = 0$, from which it follows that $a = qd \in (d)$, and hence $I = (d)$. ∎

Proposition 13.14 above implies that in order to find a nonprincipal ideal we must look elsewhere than $\mathbb{Z}$, and we now describe a nonprincipal ideal in $\mathbb{Z}[\sqrt{-3}]$:
$$I = (2, 1+\sqrt{-3}).$$
If this ideal were principal, say $I = (z)$, the generator $z$ would have to be a common divisor of both 2 and $1 + \sqrt{-3}$, both of which are prime (see Table 12.6). It follows that $z$ would have to be a unit, i.e., $\pm 1$. It is easy to see that $1 \in I$ if and only if $-1 \in I$, and hence it suffices to show that $1 \notin I$. Suppose, by way of contradiction, that $1 \in I$. Hence, there exist rational integers $x, y, u, v$ such that
$$1 = 2(x + y\sqrt{-3}) + (1+\sqrt{-3})(u + v\sqrt{-3}).$$
Equating rational and irrational terms we get
$$1 = 2x + u - 3v,$$
$$0 = 2y + u + v.$$
The subtraction of these two equations yields $1 = 2x - 2y - 4v$, which is impossible since $x, y, v$ are all rational integers. Hence $(2, 1+\sqrt{-3})$ is a nonprincipal ideal of $\mathbb{Z}[\sqrt{-3}]$.


We continue this exposition with the following observations about the ring of rational integers.

Proposition 13.15 The correspondence $a \leftrightarrow (a)$ between the nonnegative integers of $\mathbb{Z}$ and its ideals is a bijection. Moreover $a \mid b$ if and only if $(a) \supseteq (b)$.

Proof. Clearly, if $a = b$, then $(a) = (b)$. If $(a) = (b)$, then $a = \pm b$; since both $a$ and $b$ are nonnegative, $a = b$. By Proposition 13.14 every ideal of $\mathbb{Z}$ has the form $(a)$, so this takes care of all the ideals of $\mathbb{Z}$ for the first part.

If $a \mid b$, then there exists an integer $c$ such that $b = ac$. Consequently $b \in (a)$ and hence $(a) \supseteq (b)$. Conversely, if $(a) \supseteq (b)$ then $b \in (a)$ and hence there exists an integer $c$ such that $ac = b$, i.e., $a$ divides $b$. ∎
If $A = \{a_1, a_2, \ldots, a_m\}$ and $B = \{b_1, b_2, \ldots, b_n\}$, then the product $A \cdot B$ or $AB$ is defined as
$$A \cdot B = AB = (a_1b_1, a_1b_2, \ldots, a_1b_n, a_2b_1, a_2b_2, \ldots, a_2b_n, \ldots, a_mb_1, a_mb_2, \ldots, a_mb_n).$$

For example,
$$(2, 1+\sqrt{-5})^2 = (2, 1+\sqrt{-5})(2, 1+\sqrt{-5}) = (4, 2+2\sqrt{-5}, -4+2\sqrt{-5}) = (4, 2+2\sqrt{-5}, -4+2\sqrt{-5}, 2) = (2)$$
because each of the generators of the penultimate expression is divisible by 2 and, moreover, 2 is one of its generators. Also,
$$(3, 1+\sqrt{-5})(3, 1-\sqrt{-5}) = (9, 3-3\sqrt{-5}, 3+3\sqrt{-5}, 6) = (9, 3-3\sqrt{-5}, 3+3\sqrt{-5}, 6, 3) = (3)$$
because each of the generators of the penultimate expression is divisible by 3 and, moreover, 3 is one of its generators.
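The product of two generator lists is easy to compute by machine, and the resulting ideal can be compared with a conjectured answer. The sketch below makes two assumptions that are not in the text: elements $a + b\sqrt{d}$ are stored as pairs `(a, b)`, and ideals are compared through a reduced basis over the rational integers. All function names are hypothetical.

```python
from math import gcd

def mul(u, v, d):
    """(a + b*sqrt(d)) * (c + e*sqrt(d)) as a pair."""
    (a, b), (c, e) = u, v
    return (a * c + d * b * e, a * e + b * c)

def ideal_product(gens_a, gens_b, d):
    """Generators of the product ideal: all pairwise products."""
    return [mul(x, y, d) for x in gens_a for y in gens_b]

def z_basis(gens, d):
    """Canonical (p, q, g): the ideal is the Z-span of (p, 0) and (q, g)."""
    top, p = (0, 0), 0
    for a, b in gens:
        for vec in ((a, b), (d * b, a)):   # generator and generator*sqrt(d)
            u, v = top, vec
            while v[1] != 0:               # Euclid on the second coordinate
                k = u[1] // v[1]
                u, v = v, (u[0] - k * v[0], u[1] - k * v[1])
            top, p = u, gcd(p, v[0])
    q, g = top
    if g < 0:
        q, g = -q, -g
    if p:
        q %= p
    return p, q, g

def same_ideal(gens_a, gens_b, d):
    """Two generator lists span the same ideal iff their bases agree."""
    return z_basis(gens_a, d) == z_basis(gens_b, d)

square = ideal_product([(2, 0), (1, 1)], [(2, 0), (1, 1)], -5)
print(square, same_ideal(square, [(2, 0)], -5))

prod = ideal_product([(3, 0), (1, 1)], [(3, 0), (1, -1)], -5)
print(prod, same_ideal(prod, [(3, 0)], -5))
```

Comparing reduced bases replaces the ad hoc search for a convenient extra generator; the hand computation in the text remains the more instructive route.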
Proposition 13.16 If $a$ and $b$ are two elements of the ring $R$, then $(a)(b) = (ab)$.

Proof. This follows directly from the definition of the multiplication of ideals. ∎

Let $S = \{a_1, a_2, \ldots, a_m\}$ and $T = \{b_1, b_2, \ldots, b_n\}$ be two arbitrary finite subsets of a ring. The two ideals $A = (S)$ and $B = (T)$ are said to be equal if they are equal as sets. Equivalently, they are equal if $S \subseteq B$ and $T \subseteq A$. It is clear that permuting the order of the generators does not alter the ideal they determine. Moreover, if $G = \{g_1, g_2, \ldots, g_k\}$, $G' = \{g_1, g_2, \ldots, g_{k-1}\}$, and $g_k \in (G')$, then $(G) = (G')$, in which case we say that $(G')$ is obtained from $(G)$ by the removal of $g_k$ and $(G)$ is obtained from $(G')$ by the addition of $g_k$.

For example, $(2, 1-\sqrt{-7}) = (2, 1+\sqrt{-7})$ is demonstrated by the following chain of equations:
$$(2, 1-\sqrt{-7}) = (2, 1-\sqrt{-7}, 1+\sqrt{-7}) = (2, 1+\sqrt{-7}).$$
To see that $(2, 1+\sqrt{-5})(3, 1+\sqrt{-5}) = (1+\sqrt{-5})$, compute
$$(2, 1+\sqrt{-5})(3, 1+\sqrt{-5}) = (6, 2(1+\sqrt{-5}), 3(1+\sqrt{-5}), 1+2\sqrt{-5}-5)$$
$$= (6, 2+2\sqrt{-5}, 3+3\sqrt{-5})$$
$$= (1+\sqrt{-5})$$
since $6 = (1+\sqrt{-5})(1-\sqrt{-5})$.
To see that $(2, 1+\sqrt{-5})(3, 1-\sqrt{-5}) = (1-\sqrt{-5})$, compute
$$(2, 1+\sqrt{-5})(3, 1-\sqrt{-5}) = (6, 2-2\sqrt{-5}, 3+3\sqrt{-5}, 6)$$
$$= (6, 2-2\sqrt{-5}, 3+3\sqrt{-5})$$
$$= (6-(2-2\sqrt{-5})-(3+3\sqrt{-5}), 2-2\sqrt{-5}, 3+3\sqrt{-5})$$
$$= (1-\sqrt{-5}, 2-2\sqrt{-5}, 3+3\sqrt{-5})$$
$$= (1-\sqrt{-5}).$$


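Proposition 13.17 If $\mathfrak{a}$, $\mathfrak{b}$, and $\mathfrak{c}$ are ideals of $\mathbb{Z}[\sqrt{d}]$, then $\mathfrak{ab} = \mathfrak{ba}$ and $(\mathfrak{ab})\mathfrak{c} = \mathfrak{a}(\mathfrak{bc})$.

Proof. Let $\mathfrak{a} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$, $\mathfrak{b} = (\beta_1, \beta_2, \ldots, \beta_n)$, and $\mathfrak{c} = (\gamma_1, \gamma_2, \ldots, \gamma_p)$. Then
$$\mathfrak{ab} = (\ldots, \alpha_i\beta_j, \ldots) = (\ldots, \beta_j\alpha_i, \ldots) = \mathfrak{ba}$$
and
$$(\mathfrak{ab})\mathfrak{c} = (\ldots, (\alpha_i\beta_j)\gamma_k, \ldots) = (\ldots, \alpha_i(\beta_j\gamma_k), \ldots) = \mathfrak{a}(\mathfrak{bc}). \qquad ∎$$

For example,
$$(3, 1+\sqrt{-5})(3, 1-\sqrt{-5}) = (9, 3-3\sqrt{-5}, 3+3\sqrt{-5}, 6) = (9, 3-3\sqrt{-5}, 3+3\sqrt{-5}, 6, 3) = (3).$$
In $\mathbb{Z}[\sqrt{-6}]$
$$(2, 1+\sqrt{-6}) = (1). \qquad (13.18)$$
To see this, note that
$$7 = (1+\sqrt{-6})(1-\sqrt{-6}) \in (2, 1+\sqrt{-6}).$$
Hence both 7 and 2 are in that ideal. However, since 7 and 2 are relatively prime in $\mathbb{Z}$, there exist rational integers $A$ and $B$ such that $A\cdot 7 + B\cdot 2 = 1$, and hence Equation 13.18 holds.

In $\mathbb{Z}[\sqrt{-6}]$
$$(2+\sqrt{-6}, 7+2\sqrt{-6}) = (3, 1-\sqrt{-6}).$$
This follows from the observation that each generator of one ideal is a $\mathbb{Z}[\sqrt{-6}]$-linear combination of the generators of the other ideal:
$$3 = (7+2\sqrt{-6}) - 2(2+\sqrt{-6}) \quad\text{and}\quad 1-\sqrt{-6} = (7+2\sqrt{-6}) - 3(2+\sqrt{-6}),$$
and, in the other direction,
$$2+\sqrt{-6} = 3 - (1-\sqrt{-6}) \quad\text{and}\quad 7+2\sqrt{-6} = 3(1+2\sqrt{-6}) + 4(1-\sqrt{-6}).$$
Could the ideal in question be $(1)$? (See Exercise 13.3.8.)

In $\mathbb{Z}[\sqrt{-6}]$,
$$(4+\sqrt{-6}, 2-\sqrt{-6}, 7+\sqrt{-6}) = (3, 1+\sqrt{-6}).$$
This is proved by exhibiting each generator of one ideal as a $\mathbb{Z}[\sqrt{-6}]$-linear combination of the generators of the other ideal:
$$3 = (7+\sqrt{-6}) - (4+\sqrt{-6}) \quad\text{and}\quad 1+\sqrt{-6} = 2(4+\sqrt{-6}) - (7+\sqrt{-6}),$$
and, in the opposite direction,
$$4+\sqrt{-6} = 1\cdot 3 + 1\cdot(1+\sqrt{-6}), \qquad 2-\sqrt{-6} = 1\cdot 3 - 1\cdot(1+\sqrt{-6}),$$
and
$$7+\sqrt{-6} = 2\cdot 3 + 1\cdot(1+\sqrt{-6}).$$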
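Each equality of ideals in these examples reduces to a handful of arithmetic identities among the generators, which can be checked directly. The following sketch assumes elements $a + b\sqrt{-6}$ are stored as pairs `(a, b)`; the helpers `add`, `scale`, and `mul` are hypothetical names, and the combinations tested use only the listed generators.

```python
D = -6

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(k, u):
    """Multiply the element u by the rational integer k."""
    return (k * u[0], k * u[1])

def mul(u, v, d=D):
    (a, b), (c, e) = u, v
    return (a * c + d * b * e, a * e + b * c)

# (2 + s, 7 + 2s) = (3, 1 - s), where s stands for sqrt(-6)
g1, g2 = (2, 1), (7, 2)
h1, h2 = (3, 0), (1, -1)
first = [
    add(g2, scale(-2, g1)) == h1,                # 3 = g2 - 2*g1
    add(g2, scale(-3, g1)) == h2,                # 1 - s = g2 - 3*g1
    add(h1, scale(-1, h2)) == g1,                # 2 + s = 3 - (1 - s)
    add(mul(h1, (1, 2)), scale(4, h2)) == g2,    # 7 + 2s = 3(1 + 2s) + 4(1 - s)
]

# (4 + s, 2 - s, 7 + s) = (3, 1 + s)
k1, k2, k3 = (4, 1), (2, -1), (7, 1)
m1, m2 = (3, 0), (1, 1)
second = [
    add(k3, scale(-1, k1)) == m1,                # 3 = k3 - k1
    add(scale(2, k1), scale(-1, k3)) == m2,      # 1 + s = 2*k1 - k3
    add(m1, m2) == k1,
    add(m1, scale(-1, m2)) == k2,
    add(scale(2, m1), m2) == k3,
]

print(all(first), all(second))
```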


Proposition 13.19 If an ideal in $\mathbb{Z}[\sqrt{d}]$ contains two elements of $\mathbb{Z}$ which are relatively prime, then the ideal is the unit ideal. Consequently, an ideal is the unit ideal if it contains two elements whose norms are relatively prime in $\mathbb{Z}$.

Proof. Let $\alpha$ and $\beta$ be elements of the ideal $I$ which are also in $\mathbb{Z}$ and are relatively prime there. Then there exist rational integers $A$ and $B$ such that $1 = A\alpha + B\beta$. By the defining properties of ideals, every integer of the form $A\alpha + B\beta$ is contained in $I$. Hence $1 \in I$, ergo $I$ is the unit ideal.

Suppose next that $\alpha, \beta \in I$ are such that $N(\alpha)$ and $N(\beta)$ are relatively prime. Since $N(\alpha) = \alpha\bar{\alpha} \in I$ and similarly $N(\beta) \in I$, it follows from the first half of the proof that 1 is also in $I$ and hence that $I$ is the unit ideal. ∎

Proposition 13.20 If all the generators of the ideal $I$ are rational integers, then $I$ is the principal ideal $(h)$ where $h$ is the greatest common divisor of the generators.

Proof. Suppose $I = (a_1, a_2, \ldots, a_m)$ and let $h$ be the greatest common divisor of all the $a_i$'s (which are all in $\mathbb{Z}$). It follows that $h \mid a_i$ for each $i = 1, 2, \ldots, m$, so that
$$a_1, a_2, \ldots, a_m \in (h)$$
and hence $I \subseteq (h)$. To obtain the reverse inclusion, note that by Theorem 4.8 there exist rational integers $A_1, A_2, \ldots, A_m$ such that
$$h = A_1a_1 + A_2a_2 + \cdots + A_ma_m.$$
Consequently, for any integer $\alpha$
$$\alpha h = (\alpha A_1)a_1 + (\alpha A_2)a_2 + \cdots + (\alpha A_m)a_m \in I$$
and hence $(h) \subseteq I$. ∎
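The coefficients $A_1, \ldots, A_m$ used in this proof can be produced explicitly with the extended Euclidean algorithm. The sketch below is one possible way to do so and is not taken from the text; the function names `ext_gcd` and `bezout` are hypothetical.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) >= 0 and a*x + b*y == g."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    if a < 0:
        a, x0, y0 = -a, -x0, -y0
    return a, x0, y0

def bezout(gens):
    """Return (h, coeffs) with h = gcd(gens) and sum(A_i * a_i) == h."""
    h, coeffs = 0, []
    for a in gens:
        h, x, y = ext_gcd(h, a)
        # The old gcd was sum(coeffs * earlier gens); rescale and append.
        coeffs = [x * c for c in coeffs] + [y]
    return h, coeffs

gens = [62, 34, 10]
h, A = bezout(gens)
print(h, A, sum(c * a for c, a in zip(A, gens)))
```

With these coefficients in hand, any multiple $\alpha h$ is visibly a combination of the generators, which is exactly the inclusion $(h) \subseteq I$.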
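Proposition 13.21 Let $\alpha$ and $\beta$ be integers of $\mathbb{Z}[\sqrt{d}]$. Then $(\alpha) = (\beta)$ if and only if $\alpha$ and $\beta$ are associates.

Proof. If one of $\alpha$ and $\beta$ is 0, we may assume without loss of generality that $\beta = 0$. In that case the following are equivalent:

• $(\alpha) = (\beta)$;

• $\alpha = 0$;

• $\alpha$ and $\beta$ are associates.

Thus, we may assume that neither $\alpha$ nor $\beta$ is zero. By Proposition 13.11, the following are equivalent:

• $(\alpha) = (\beta)$;

• $\alpha \in (\beta)$ and $\beta \in (\alpha)$;

• $\alpha = \beta\gamma$ and $\beta = \alpha\gamma'$ for some $\gamma, \gamma' \in \mathbb{Z}[\sqrt{d}]$.

Consequently, if $(\alpha) = (\beta)$, then $\alpha = \beta\gamma = \alpha\gamma'\gamma$, from which it follows that $\gamma'\gamma = 1$ so that both $\gamma$ and $\gamma'$ are units.

Conversely, if $\alpha$ and $\beta$ are associates, then $\alpha \mid \beta$ and $\beta \mid \alpha$ and hence $(\alpha) = (\beta)$. ∎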


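Suppose $\mathfrak{a}$ and $\mathfrak{b}$ are two ideals of $\mathbb{Z}[\sqrt{d}]$. Their product $\mathfrak{ab}$ is the set of all finite sums of the form
$$\left\{\, \sum_{k=1}^{m} x_ky_k \;\middle|\; x_k \in \mathfrak{a},\ y_k \in \mathfrak{b},\ m = 1, 2, 3, \ldots \right\}.$$

The rationale behind this definition is that at the very least the “product” of $\mathfrak{a}$ and $\mathfrak{b}$ should contain all products of the form $xy$ where $x \in \mathfrak{a}$ and $y \in \mathfrak{b}$. However, this product should also be an ideal and so it must be closed under addition. Hence the summation.

Proposition 13.22 If $\mathfrak{a} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ and $\mathfrak{b} = (\beta_1, \beta_2, \ldots, \beta_n)$, then
$$\mathfrak{ab} = (\alpha_1\beta_1, \ldots, \alpha_i\beta_j, \ldots, \alpha_m\beta_n).$$

Proof. Let $\sum_k x_ky_k$ be an arbitrary element of $\mathfrak{ab}$. Note that each $x_k$ is a $\mathbb{Z}[\sqrt{d}]$-linear combination of the $\alpha_i$'s and each $y_k$ is a $\mathbb{Z}[\sqrt{d}]$-linear combination of the $\beta_j$'s. When these multiplications are carried out, we obtain $\sum_k x_ky_k$ as a $\mathbb{Z}[\sqrt{d}]$-linear combination of the $\alpha_i\beta_j$'s. Hence
$$\mathfrak{ab} \subseteq (\alpha_1\beta_1, \ldots, \alpha_i\beta_j, \ldots, \alpha_m\beta_n).$$
Conversely, every element of $(\alpha_1\beta_1, \ldots, \alpha_i\beta_j, \ldots, \alpha_m\beta_n)$ can be written in the form
$$\sum_{i=1}^{m}\sum_{j=1}^{n} \gamma_{ij}\alpha_i\beta_j. \qquad (13.23)$$
Since $\gamma_{ij}\alpha_i \in \mathfrak{a}$ and $\beta_j \in \mathfrak{b}$, it follows that the sum of Equation 13.23 is in $\mathfrak{ab}$. ∎

Consider the product of the ideal $\mathfrak{a} = (5+\sqrt{-6}, 2+\sqrt{-6})$ and the ideal $\mathfrak{b} = (4+\sqrt{-6}, 2-\sqrt{-6})$. By definition, $\mathfrak{ab}$ is generated by the products $(5+\sqrt{-6})(4+\sqrt{-6})$, $(5+\sqrt{-6})(2-\sqrt{-6})$, $(2+\sqrt{-6})(4+\sqrt{-6})$, and $(2+\sqrt{-6})(2-\sqrt{-6})$, so that
$$\mathfrak{ab} = (14+9\sqrt{-6},\; 16-3\sqrt{-6},\; 2+6\sqrt{-6},\; 10),$$
which, according to an earlier example, simplifies to $(2, 3\sqrt{-6})$.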
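The four generator products, and the two elementary operations that clear the irrational parts of the first and third generators, can be reproduced in a few lines. This is only an illustrative sketch; elements $a + b\sqrt{-6}$ are stored as pairs `(a, b)` and the helper names are hypothetical.

```python
D = -6

def mul(u, v, d=D):
    """(a + b*sqrt(d)) * (c + e*sqrt(d)) as a pair."""
    (a, b), (c, e) = u, v
    return (a * c + d * b * e, a * e + b * c)

def add_multiple(u, k, v):
    """Return u + k*v for a rational integer k."""
    return (u[0] + k * v[0], u[1] + k * v[1])

a_gens = [(5, 1), (2, 1)]      # 5 + sqrt(-6), 2 + sqrt(-6)
b_gens = [(4, 1), (2, -1)]     # 4 + sqrt(-6), 2 - sqrt(-6)

products = [mul(x, y) for x in a_gens for y in b_gens]
print(products)

first, second, third, fourth = products
# Adding multiples of the second generator leaves rational integers behind.
third_reduced = add_multiple(third, 2, second)
first_reduced = add_multiple(first, 3, second)
print(first_reduced, third_reduced)
```

The rational generators that remain have greatest common divisor 2, and modulo 2 the second generator differs from $3\sqrt{-6}$ only by a sign and a multiple of 2, in line with the simplified form quoted in the text.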


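Corollary 13.24 For any two ideals $\mathfrak{a}$ and $\mathfrak{b}$, $\mathfrak{ab} = (0)$ if and only if $\mathfrak{a} = (0)$ or $\mathfrak{b} = (0)$.

Proof. It follows from the definition that if either $\mathfrak{a}$ or $\mathfrak{b}$ is $(0)$, then $\mathfrak{ab} = (0)$. Conversely, suppose $\mathfrak{ab} = (0)$ but both $\mathfrak{a}$ and $\mathfrak{b}$ are nonzero. Then there exist nonzero integers $\alpha \in \mathfrak{a}$ and $\beta \in \mathfrak{b}$. Hence $\alpha\beta$ is a nonzero element of $\mathfrak{ab}$, contradicting the fact that $\mathfrak{ab} = (0)$. ∎

Corollary 13.25 For the ideal $\mathfrak{a} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ and the principal ideal $\mathfrak{c} = (\gamma)$,
$$\mathfrak{a}(\gamma) = (\gamma\alpha_1, \gamma\alpha_2, \ldots, \gamma\alpha_m).$$
In particular, the unit ideal $(1) = \mathbb{Z}[\sqrt{d}]$ is an identity element for the multiplication of ideals, and $(\alpha\beta) = (\alpha)(\beta)$.

For an ideal $\mathfrak{a}$, its conjugate ideal is the ideal $\bar{\mathfrak{a}} = \{\bar{\alpha} \mid \alpha \in \mathfrak{a}\}$.

Proposition 13.26 If $\mathfrak{a} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$, then $\bar{\mathfrak{a}} = (\bar{\alpha}_1, \bar{\alpha}_2, \ldots, \bar{\alpha}_m)$. In particular, if $\mathfrak{a} = (\alpha)$ is a principal ideal, then so is $\bar{\mathfrak{a}}$, namely the principal ideal $(\bar{\alpha})$.

Proof. By definition,
$$\bar{\mathfrak{a}} = \left\{\, \overline{\sum_{i=1}^{m} r_i\alpha_i} \;\middle|\; r_i \in \mathbb{Z}[\sqrt{d}] \right\}.$$
Making use of the additivity and multiplicativity of conjugation we obtain
$$\bar{\mathfrak{a}} = \left\{\, \sum_{i=1}^{m} \bar{r}_i\bar{\alpha}_i \;\middle|\; r_i \in \mathbb{Z}[\sqrt{d}] \right\} = (\bar{\alpha}_1, \bar{\alpha}_2, \ldots, \bar{\alpha}_m).$$
The remainder of this proposition is relegated to Exercise 13.3.6. ∎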


When an integer is equal to its conjugate, it must be in $\mathbb{Z}$. This is not the case for self-conjugate ideals. Consider the ideal $(2, \sqrt{-6})$. Its conjugate is
$$(\bar{2}, \overline{\sqrt{-6}}) = (2, -\sqrt{-6}) = (2, \sqrt{-6})$$
and so we see that even though the ideal is self-conjugate, it need not consist of rational integers only.

The ideal $(5, 2+\sqrt{-6})$ is not principal and is not its own conjugate. We first show that
$$(5, 2+\sqrt{-6})(5, 2-\sqrt{-6}) = (5). \qquad (13.27)$$
Indeed,
$$(5, 2+\sqrt{-6})(5, 2-\sqrt{-6}) = (25, 10-5\sqrt{-6}, 10+5\sqrt{-6}, 4-(-6))$$
$$= (5)(5, 2-\sqrt{-6}, 2+\sqrt{-6}, 2)$$
$$= (5)(1) = (5).$$

Now, suppose by way of contradiction that the given ideal is principal. Then there exists an integer $\alpha$ such that $(5, 2+\sqrt{-6}) = (\alpha)$. Hence $(\alpha)(\bar{\alpha}) = (5)$. Thus,
$$(5) = (\alpha)(\bar{\alpha}) = (\alpha\bar{\alpha}) = (N(\alpha))$$
and it follows that $N(\alpha) = \pm 5$. Since $\mathbb{Z}[\sqrt{-6}]$ contains no elements of norm $\pm 5$, we have our contradiction.

To show that the given ideal is not equal to its conjugate, we suppose, again by way of contradiction, that it does equal its conjugate ideal. Then Equation 13.27 becomes
$$(5, 2+\sqrt{-6})^2 = (5).$$
When we compute the above squared ideal $(5, 2+\sqrt{-6})^2$ directly, we obtain
$$(5) = (5, 2+\sqrt{-6})^2 = (5, 2+\sqrt{-6})(5, 2+\sqrt{-6})$$
$$= (25, 5(2+\sqrt{-6}), (2+\sqrt{-6})^2)$$
$$= (25, 5(2+\sqrt{-6}), 4+4\sqrt{-6}-6)$$
$$= (25, 5(2+\sqrt{-6}), -2+4\sqrt{-6}).$$
However, 5 is not an integer factor of $-2+4\sqrt{-6}$ and this is the required contradiction.


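Let
$$\mathfrak{p} = (5, 2+\sqrt{-6}) \quad\text{and}\quad \mathfrak{q} = (2, 2+\sqrt{-6})$$
so that
$$\bar{\mathfrak{p}} = (5, 2-\sqrt{-6}) \quad\text{and}\quad \bar{\mathfrak{q}} = (2, 2-\sqrt{-6}).$$
Then it can be demonstrated (see Exercise 13.3.7) that
$$\mathfrak{p}\bar{\mathfrak{p}} = (5), \quad \mathfrak{q}\bar{\mathfrak{q}} = (2), \quad\text{and}\quad \mathfrak{pq} = (2+\sqrt{-6}). \qquad (13.28)$$

Set $\mathfrak{a} \mid \mathfrak{b}$ if $\mathfrak{b} = \mathfrak{ac}$ for some ideal $\mathfrak{c}$. We say that $\mathfrak{a}$ divides $\mathfrak{b}$ or is a divisor or a factor of $\mathfrak{b}$, or that $\mathfrak{b}$ is a multiple of $\mathfrak{a}$.
For example, since $(2)(3) = (6)$, it follows that $(2) \mid (6)$ and $(3) \mid (6)$. Similarly, since
$$(1+2\sqrt{-6})(1+\sqrt{-6}) = -11+3\sqrt{-6}$$
it follows that
$$(1+2\sqrt{-6}) \mid (-11+3\sqrt{-6}) \quad\text{and}\quad (1+\sqrt{-6}) \mid (-11+3\sqrt{-6}).$$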


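Proposition 13.29 If $\alpha, \beta \in \mathbb{Z}[\sqrt{d}]$, then $(\alpha) \mid (\beta)$ if and only if $\alpha \mid \beta$.

Proof. Suppose $\alpha \mid \beta$ in $\mathbb{Z}[\sqrt{d}]$. This means that there exists $\gamma \in \mathbb{Z}[\sqrt{d}]$ such that $\beta = \alpha\gamma$. Then, by Proposition 13.16, $(\beta) = (\alpha\gamma) = (\alpha)(\gamma)$. Hence $(\alpha) \mid (\beta)$. Conversely, assume that $(\alpha) \mid (\beta)$. Then there exists an ideal $\mathfrak{c}$ such that $(\beta) = (\alpha)\mathfrak{c}$. By definition, there exist integers $\gamma_1, \gamma_2, \ldots, \gamma_p$ such that
$$\mathfrak{c} = (\gamma_1, \gamma_2, \ldots, \gamma_p).$$
By Corollary 13.25
$$(\beta) = (\alpha)(\gamma_1, \gamma_2, \ldots, \gamma_p) = (\alpha\gamma_1, \alpha\gamma_2, \ldots, \alpha\gamma_p).$$
Hence, like all the elements of $(\beta)$, the integer $\beta$ is a $\mathbb{Z}[\sqrt{d}]$-linear combination of the $\alpha\gamma_k$'s, say
$$\beta = \sum_{k=1}^{p} s_k\alpha\gamma_k = \alpha\sum_{k=1}^{p} s_k\gamma_k.$$
Consequently $\alpha \mid \beta$. ∎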


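Proposition 13.30 Let $\alpha$ be an integer and $\mathfrak{b} = (\beta_1, \beta_2, \ldots, \beta_n)$ an ideal in $\mathbb{Z}[\sqrt{d}]$. Then the following are equivalent:

(a) $(\alpha) \mid \mathfrak{b}$;

(b) $\alpha \mid \beta_i$ for all $i = 1, 2, \ldots, n$;

(c) $(\alpha) \supseteq \mathfrak{b}$.

Proof. (a) $\Rightarrow$ (b): Suppose $(\alpha) \mid \mathfrak{b}$. This means that there is an ideal $\mathfrak{c} = (\gamma_1, \gamma_2, \ldots, \gamma_p)$ such that
$$\mathfrak{b} = (\alpha)\mathfrak{c} = (\alpha\gamma_1, \alpha\gamma_2, \ldots, \alpha\gamma_p).$$
Hence every element of $\mathfrak{b}$ is divisible by $\alpha$, including the $\beta_i$'s.

(b) $\Rightarrow$ (c): Suppose $\alpha \mid \beta_i$ for all $i = 1, 2, \ldots, n$. Then clearly $\alpha$ must divide all the elements of $(\beta_1, \beta_2, \ldots, \beta_n)$. Consequently, since $(\alpha)$ consists of all the integers divisible by $\alpha$, we have $(\alpha) \supseteq \mathfrak{b}$.

(c) $\Rightarrow$ (a): Suppose $(\alpha) \supseteq \mathfrak{b}$. Then every integer of $\mathfrak{b}$ is a multiple of $\alpha$. The set
$$\mathfrak{b}/\alpha = \{\, m/\alpha \mid m \in \mathfrak{b} \,\}$$
is an ideal with generating set $\beta_1/\alpha, \beta_2/\alpha, \ldots, \beta_n/\alpha$. By the definition of the product of ideals, $(\alpha)(\mathfrak{b}/\alpha) = \mathfrak{b}$, and hence $(\alpha) \mid \mathfrak{b}$. ∎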


Proposition 13.31 Let $\mathfrak{a}$ and $\mathfrak{b}$ be two ideals of $\mathbb{Z}[\sqrt{d}]$. Then
(a) if $\mathfrak{a} \mid \mathfrak{b}$, then $\mathfrak{a} \supseteq \mathfrak{b}$;
(b) if $\mathfrak{a} \mid \mathfrak{b}$ and $\mathfrak{b} \mid \mathfrak{a}$, then $\mathfrak{a} = \mathfrak{b}$.

Proof. Since (b) follows immediately from (a), we only prove part (a): If $\mathfrak{a} \mid \mathfrak{b}$, then there exists an ideal $\mathfrak{c}$ such that $\mathfrak{b} = \mathfrak{ac}$. Suppose that $\mathfrak{a} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ and $\mathfrak{c} = (\gamma_1, \gamma_2, \ldots, \gamma_p)$. In that case $\mathfrak{b}$ is generated by the $mp$ products $\alpha_i\gamma_k$. However, each such product belongs to $\mathfrak{a}$ (and also to $\mathfrak{c}$, but that's not important to us). It follows that every element of $\mathfrak{b}$ is also an integer of $\mathfrak{a}$. ∎

To summarize, it has been shown that every number system $\mathbb{Z}[\sqrt{d}]$ can be placed inside a new system of ideals in such a way that each element $\alpha \in \mathbb{Z}[\sqrt{d}]$ is identified with the principal ideal $(\alpha)$. These ideals are subject to a binary operation, also called ideal multiplication, such that for any two integers $\alpha, \beta \in \mathbb{Z}[\sqrt{d}]$, $(\alpha)(\beta) = (\alpha\beta)$.

In other words, the multiplication of ideals “faithfully represents” integer multiplication. The advantage to algebraists is that the multiplication of ideals does possess unique prime factorization.

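We conclude this section with two more examples for the reader to practice on:

Let $\mathfrak{p} = (5, 2+\sqrt{-6})$ and $\mathfrak{q} = (2, \sqrt{-6})$. Standard calculations show that $\mathfrak{p}\bar{\mathfrak{p}} = (5)$ and $\mathfrak{q}\bar{\mathfrak{q}} = (2)$. Hence, $N(\mathfrak{p}) = 5$ and $N(\mathfrak{q}) = 2$, implying that
$$(10) = (5)(2) = \mathfrak{p}\bar{\mathfrak{p}}\mathfrak{q}\bar{\mathfrak{q}}$$
is a factorization of the ideal $(10)$ into prime ideals. The principal ideal $(2+\sqrt{-6})$ has norm 10.

Let $\mathfrak{p} = (5, 2+\sqrt{-6})$ and $\mathfrak{q} = (7, 1+\sqrt{-6})$. Then calculations show that
$$\mathfrak{p}\bar{\mathfrak{p}} = (5), \quad \mathfrak{q}\bar{\mathfrak{q}} = (7), \quad \mathfrak{pq} = (35, -13+\sqrt{-6}), \quad\text{and}\quad \bar{\mathfrak{p}}\bar{\mathfrak{q}} = (35, -13-\sqrt{-6}).$$
Hence the principal ideal $(35)$ can be factored as
$$(35) = (5)(7) = \mathfrak{p}\bar{\mathfrak{p}}\mathfrak{q}\bar{\mathfrak{q}}$$
as well as
$$(35) = (35, -13+\sqrt{-6})(35, -13-\sqrt{-6}) = \mathfrak{pq}\,\bar{\mathfrak{p}}\bar{\mathfrak{q}}.$$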




Exercises 13.3 


1. Decide whether the following assertions are true or false in $\mathbb{Z}[\sqrt{-6}]$.
(a) $(2, 1+\sqrt{-6}) = (2, 1-\sqrt{-6})$
(b) $(3, 1+\sqrt{-6}) = (3, 1-\sqrt{-6})$
(c) $(2, 1+\sqrt{-6}) = (4, 2+2\sqrt{-6})$
(d) $(2, 1+\sqrt{-6}) = (2, 1+\sqrt{-6})$
(e) $(29, 32-27\sqrt{-6}) = (3+2\sqrt{-6})$
(f) $(49, 21-7\sqrt{-6}, 21+7\sqrt{-6}, 14) = (7)$
(g) $(3-\sqrt{-6}, 1+2\sqrt{-6}) = (7, 3-\sqrt{-6})$

2. Decide whether the following assertions are true or false in $\mathbb{Z}[\sqrt{-5}]$.
(a) $(2, 1+\sqrt{-5}) = (2, 1-\sqrt{-5})$
(b) $(3, 1+\sqrt{-5}) = (3, 1-\sqrt{-5})$
(c) $(2, 1+\sqrt{-5}) = (4, 2+2\sqrt{-5})$
(d) $(2, 1+\sqrt{-5}) = (2, 1+\sqrt{-5})$
(e) $(29, 32-27\sqrt{-5}) = (3+2\sqrt{-5})$
(f) $(49, 21-7\sqrt{-5}, 21+7\sqrt{-5}, 14) = (7)$
(g) $(3-\sqrt{-5}, 1+2\sqrt{-5}) = (7, 3-\sqrt{-5})$

3. Which of these ideals are principal and which are not?
(a) $(9)$
(b) $(2-\sqrt{-6})$
(c) $(6, 8, 2+6\sqrt{-6})$
(d) $(3, 3\sqrt{-6})$
(e) $(3, \sqrt{-6})$
(f) $(5, \sqrt{-6})$
(g) $(2, 1+\sqrt{-6})$
(h) $(3, 1+\sqrt{-6})$
(i) $(7, 1+2\sqrt{-6})$
(j) $(21, 9+3\sqrt{-6}, -2+4\sqrt{-6})$

4. Which of these ideals are principal and which are not?
(a) $(9)$
(b) $(2-\sqrt{-5})$
(c) $(6, 8, 2+6\sqrt{-5})$
(d) $(3, 3\sqrt{-5})$
(e) $(3, \sqrt{-5})$
(f) $(5, \sqrt{-5})$
(g) $(2, 1+\sqrt{-5})$
(h) $(3, 1+\sqrt{-5})$
(i) $(7, 1+2\sqrt{-5})$
(j) $(21, 9+3\sqrt{-5}, -2+4\sqrt{-5})$

5. Prove that $\mathbb{Z}[\sqrt{-6}]$ has no element of norm 2.

6. Complete the proof of Proposition 13.26.

7. Supply the missing details for the proof of Equation 13.28.

8. As proven in the text, in $\mathbb{Z}[\sqrt{-6}]$
$$(2+\sqrt{-6}, 7+2\sqrt{-6}) = (3, 1-\sqrt{-6}).$$
Is this ideal $(1)$?


13.4 Cancelation of Ideals 


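If $c \neq 0$ and $a$ and $b$ are all complex numbers, then it is well known that
$$ac = bc \;\Rightarrow\; a = b.$$
This property is not to be taken for granted. For example, in $\mathbb{Z}_6$,
$$2\cdot 3 \equiv 0 \equiv 4\cdot 3 \quad\text{but}\quad 2 \not\equiv 4 \pmod 6.$$
An ideal $\mathfrak{c}$ of $\mathbb{Z}[\sqrt{d}]$ is a cancelable ideal if for every two ideals $\mathfrak{a}$ and $\mathfrak{b}$
$$\mathfrak{ac} = \mathfrak{bc} \;\Rightarrow\; \mathfrak{a} = \mathfrak{b}.$$

It is clear that $(0)$ is not cancelable since $\mathfrak{a}(0) = (0) = \mathfrak{b}(0)$ for any ideals $\mathfrak{a}$ and $\mathfrak{b}$.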
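Proposition 13.32 Nonzero principal ideals are cancelable.

Proof. Let $\mathfrak{a} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ and $\mathfrak{b} = (\beta_1, \beta_2, \ldots, \beta_n)$ be ideals and let $(\gamma)$ be a nonzero principal ideal such that $\mathfrak{a}(\gamma) = \mathfrak{b}(\gamma)$ and consequently
$$(\gamma\alpha_1, \gamma\alpha_2, \ldots, \gamma\alpha_m) = (\gamma\beta_1, \gamma\beta_2, \ldots, \gamma\beta_n).$$
This implies that every $\gamma\alpha_i$ is a $\mathbb{Z}[\sqrt{d}]$-linear combination of the $\gamma\beta_j$'s, say
$$\gamma\alpha_i = \sum_{j=1}^{n} r_{i,j}\gamma\beta_j.$$
Cancelation by the nonzero integer $\gamma$ displays $\alpha_i$ as a $\mathbb{Z}[\sqrt{d}]$-linear combination of the $\beta_j$'s. Hence $\mathfrak{a} \subseteq \mathfrak{b}$. The reverse containment is proved in a similar manner. ∎

Corollary 13.33 Every nonzero ideal which has a principal multiple is cancelable.

Proof. Let $\mathfrak{c}$ be a nonzero ideal and suppose there exists another ideal $\mathfrak{c}'$ such that $\mathfrak{cc}' = (\gamma)$ for some nonzero integer $\gamma$. If $\mathfrak{a}$ and $\mathfrak{b}$ are any two ideals such that $\mathfrak{ac} = \mathfrak{bc}$, then
$$\mathfrak{acc}' = \mathfrak{bcc}' \quad\text{or}\quad \mathfrak{a}(\gamma) = \mathfrak{b}(\gamma).$$
Since $(\gamma)$ is cancelable, it follows that $\mathfrak{a} = \mathfrak{b}$. ∎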
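Lemma 13.34 If $\mathfrak{a} = (\alpha, \beta)$ is an ideal of $\mathbb{Z}[\sqrt{d}]$ with two generators, then
$$\mathfrak{a}\bar{\mathfrak{a}} = (N(\alpha), \operatorname{Tr}(\alpha\bar{\beta}), N(\beta)). \qquad (13.35)$$

Proof. Clearly
$$\mathfrak{a}\bar{\mathfrak{a}} = (\alpha, \beta)(\bar{\alpha}, \bar{\beta}) = (\alpha\bar{\alpha}, \alpha\bar{\beta}, \beta\bar{\alpha}, \beta\bar{\beta}) \supseteq (N(\alpha), \operatorname{Tr}(\alpha\bar{\beta}), N(\beta)) = (h)$$
where, by Proposition 13.20, $h$ is the greatest common divisor of the rational integers $N(\alpha)$, $\operatorname{Tr}(\alpha\bar{\beta})$, and $N(\beta)$. This proves that the left-hand side of Equation 13.35 contains its right-hand side.

To prove the converse it suffices to show that $\alpha\bar{\beta}$ is contained in the right-hand side, since the same argument then applies to its conjugate $\beta\bar{\alpha}$. This, in turn, will follow from proving that $h$ divides $\alpha\bar{\beta}$, which is equivalent to saying that
$$\frac{\alpha\bar{\beta}}{h} \in \mathbb{Z}[\sqrt{d}].$$
This, however, follows from Proposition 13.3 and the equations
$$\operatorname{Tr}\!\left(\frac{\alpha\bar{\beta}}{h}\right) = \frac{\alpha\bar{\beta} + \bar{\alpha}\beta}{h} = \frac{\operatorname{Tr}(\alpha\bar{\beta})}{h} \in \mathbb{Z},$$
$$N\!\left(\frac{\alpha\bar{\beta}}{h}\right) = \frac{\alpha\bar{\beta}\,\bar{\alpha}\beta}{h^2} = \frac{N(\alpha)}{h}\cdot\frac{N(\beta)}{h} \in \mathbb{Z}. \qquad ∎$$

For example, let $\mathfrak{a} = (2+3\sqrt{-6}, 4)$. Then $\mathfrak{a}\bar{\mathfrak{a}} = (58, 16, 16) = (2)$.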


Theorem 13.36 If $\mathfrak{a} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$, then $\mathfrak{a}\bar{\mathfrak{a}}$ is generated by the $m$ rational integers $N(\alpha_1), N(\alpha_2), \ldots, N(\alpha_m)$ and the $m(m-1)/2$ rational integers $\operatorname{Tr}(\alpha_i\bar{\alpha}_j)$ where $i < j$.

Sketch of proof. Proceed by induction on $m$ and use the method provided in the proof of Lemma 13.34. ∎

Corollary 13.37 If $\mathfrak{a}$ is an ideal, then $\mathfrak{a}\bar{\mathfrak{a}}$ is principal.

Proof. By Theorem 13.36, $\mathfrak{a}\bar{\mathfrak{a}}$ has a set of rational generators, and so by Proposition 13.20 it is necessarily principal. ∎

Corollary 13.38 Every nonzero ideal of $\mathbb{Z}[\sqrt{d}]$ is cancelable.

Proof. Let $\mathfrak{a}$ be a nonzero ideal. By Corollary 13.37, $\mathfrak{a}\bar{\mathfrak{a}}$ is principal. In particular, $\mathfrak{a}$ has a principal multiple. By Corollary 13.33, $\mathfrak{a}$ is cancelable. ∎
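Corollary 13.39 If $\mathfrak{a}$ and $\mathfrak{b}$ are ideals of $\mathbb{Z}[\sqrt{d}]$, then
$$\mathfrak{a} \mid \mathfrak{b} \quad\text{if and only if}\quad \mathfrak{a} \supseteq \mathfrak{b}.$$

Proof. Suppose first that $\mathfrak{a} = (0)$. Then the following are equivalent:

• $\mathfrak{a} \mid \mathfrak{b}$;

• $(0) \mid \mathfrak{b}$;

• $\mathfrak{b} = (0)$;

• $(0) \supseteq \mathfrak{b}$.

Hence we may assume that $\mathfrak{a} \neq (0)$. By Proposition 13.31, $\mathfrak{a} \mid \mathfrak{b}$ implies $\mathfrak{a} \supseteq \mathfrak{b}$. To prove the reverse implication, suppose $\mathfrak{a} \supseteq \mathfrak{b}$. It then follows that $\mathfrak{a}\bar{\mathfrak{a}} \supseteq \mathfrak{b}\bar{\mathfrak{a}}$. By Corollary 13.37 there exists an integer $a$ such that $\mathfrak{a}\bar{\mathfrak{a}} = (a)$ and, hence, $(a) \supseteq \mathfrak{b}\bar{\mathfrak{a}}$, which by Proposition 13.30 implies that $(a) \mid \mathfrak{b}\bar{\mathfrak{a}}$.

By definition there exists an ideal $\mathfrak{c}$ such that $(a)\mathfrak{c} = \mathfrak{b}\bar{\mathfrak{a}}$. When both sides of this equation are multiplied by $\mathfrak{a}$, we get
$$(a)\mathfrak{ca} = \mathfrak{b}\bar{\mathfrak{a}}\mathfrak{a} = \mathfrak{b}(a).$$
Cancelation by the nonzero ideal $\mathfrak{a}\bar{\mathfrak{a}} = (a)$ yields $\mathfrak{ca} = \mathfrak{b}$ and hence $\mathfrak{a} \mid \mathfrak{b}$. ∎

The gist of this surprising result is that amongst ideals “to contain is to divide.” This is somewhat counterintuitive because amongst the rational integers the divisor is generally viewed as being smaller than the dividend. In the context of ideals, however, the opposite happens: the divisor of an ideal is greater than the dividend in the sense that it contains the dividend.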


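Corollary 13.40 The divisors of an ideal $\mathfrak{a}$ are precisely the ideals $\mathfrak{b}$ such that $\mathfrak{b} \supseteq \mathfrak{a}$. In particular $\alpha \in \mathfrak{a}$ if and only if $\mathfrak{a} \mid (\alpha)$.

This corollary makes it easy to find examples of proper containment (or division) amongst ideals. If $\mathfrak{a} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ and $\alpha \notin \mathfrak{a}$, then $\mathfrak{b} = (\alpha_1, \alpha_2, \ldots, \alpha_m, \alpha)$ is a proper divisor of $\mathfrak{a}$.

The sum of the two ideals $\mathfrak{a}$ and $\mathfrak{b}$ is
$$\mathfrak{a} + \mathfrak{b} = \{\, \alpha + \beta \mid \alpha \in \mathfrak{a},\ \beta \in \mathfrak{b} \,\}.$$

Proposition 13.41 If $\mathfrak{a} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ and $\mathfrak{b} = (\beta_1, \beta_2, \ldots, \beta_n)$, then
$$\mathfrak{a} + \mathfrak{b} = (\alpha_1, \alpha_2, \ldots, \alpha_m, \beta_1, \beta_2, \ldots, \beta_n).$$

Proof. Exercise 13.4.2. ∎

Proposition 13.42 For two ideals $\mathfrak{a}$ and $\mathfrak{b}$, the ideal $\mathfrak{a} + \mathfrak{b}$ is a common divisor of $\mathfrak{a}$ and $\mathfrak{b}$ which is divisible by all the other common divisors.

Proof. Since $\mathfrak{a} + \mathfrak{b}$ contains both $\mathfrak{a}$ and $\mathfrak{b}$, it follows from Corollary 13.40 that $\mathfrak{a} + \mathfrak{b}$ divides both $\mathfrak{a}$ and $\mathfrak{b}$. On the other hand, if $\mathfrak{c}$ divides both $\mathfrak{a}$ and $\mathfrak{b}$, then, by the same corollary, $\mathfrak{c}$ contains both $\mathfrak{a}$ and $\mathfrak{b}$ so that $\mathfrak{c} \supseteq \mathfrak{a} + \mathfrak{b}$. This, of course, means that $\mathfrak{c}$ divides $\mathfrak{a} + \mathfrak{b}$. ∎

This proposition motivates the definition of the greatest common divisor of the two ideals $\mathfrak{a}$ and $\mathfrak{b}$ as their sum $\mathfrak{a} + \mathfrak{b}$.

For example, among the ideals of $\mathbb{Z}[\sqrt{-5}]$, the principal ideals $(3)$ and $(1+\sqrt{-5})$ have greatest common divisor $(3, 1+\sqrt{-5})$, which is nonprincipal (Exercise 13.4.3).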


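Corollary 13.43 Every ideal is a sum of principal ideals.

Proof. Since every ideal in $\mathbb{Z}[\sqrt{d}]$ has the form $(\alpha_1, \alpha_2, \ldots, \alpha_m)$, it follows that
$$(\alpha_1, \alpha_2, \ldots, \alpha_m) = \alpha_1\mathbb{Z}[\sqrt{d}] + \alpha_2\mathbb{Z}[\sqrt{d}] + \cdots + \alpha_m\mathbb{Z}[\sqrt{d}] = (\alpha_1) + (\alpha_2) + \cdots + (\alpha_m). \qquad ∎$$


Exercises 13.4


1. Find a single element of $\mathfrak{a}\bar{\mathfrak{a}}$ that generates it for the following ideals $\mathfrak{a}$:
(a) $(12, 18)$
(b) $(12, 18, 30)$
(c) $(2+3\sqrt{-1}, 6)$
(d) $(14+5\sqrt{-2}, 2-\sqrt{-2})$

2. Prove Proposition 13.41.

3. Prove that $(3, 1+\sqrt{-6})$ is a nonprincipal ideal.
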
13.5 Norms of Ideals 


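According to Theorem 13.36, $\mathfrak{a}\bar{\mathfrak{a}}$ is a principal ideal that is generated by an element of $\mathbb{Z}$. Since for any such rational generator $g$, $(-g) = (g)$, we may assume that $g > 0$. The positive generator $g$ is unique because if $g$ and $h$ are in $\mathbb{Z}$ and $(g) = (h)$, then $g = hu$ for some unit $u$ that is necessarily rational. Hence $u = \pm 1$.

If $\mathfrak{a}$ is an ideal in $\mathbb{Z}[\sqrt{d}]$ let $N(\mathfrak{a})$ denote the positive generator of $\mathfrak{a}\bar{\mathfrak{a}}$. We call $N(\mathfrak{a})$ the (ideal) norm of $\mathfrak{a}$.

For example, in $\mathbb{Z}[\sqrt{-6}]$, the ideal $(5, 2+\sqrt{-6})$ has norm 5.

Proposition 13.44 The ideal norm and the element norm agree on principal ideals. That is, if $\mathfrak{a} = (\alpha)$, then $N(\mathfrak{a}) = |N(\alpha)|$.

Proof. We have
$$\mathfrak{a}\bar{\mathfrak{a}} = (\alpha)(\bar{\alpha}) = (\alpha\bar{\alpha}) = (N(\alpha)) = (|N(\alpha)|).$$
The last term equals $(N(\mathfrak{a}))$, so $N(\mathfrak{a}) = |N(\alpha)|$ because both are positive and they generate the same principal ideal. ∎

Proposition 13.45 If $\mathfrak{a}$ and $\mathfrak{b}$ are nonzero ideals, then $N(\mathfrak{ab}) = N(\mathfrak{a})N(\mathfrak{b})$.

Proof. We have
$$(N(\mathfrak{ab})) = \mathfrak{ab}\,\overline{\mathfrak{ab}} = \mathfrak{ab}\,\bar{\mathfrak{a}}\bar{\mathfrak{b}} = \mathfrak{a}\bar{\mathfrak{a}}\,\mathfrak{b}\bar{\mathfrak{b}} = (N(\mathfrak{a}))(N(\mathfrak{b})) = (N(\mathfrak{a})N(\mathfrak{b}))$$
from which it follows that $N(\mathfrak{ab}) = N(\mathfrak{a})N(\mathfrak{b})$ since both are positive generators of the same principal ideal. ∎

Corollary 13.46 Let $\mathfrak{a}$ and $\mathfrak{b}$ be nonzero ideals such that $\mathfrak{a} \mid \mathfrak{b}$. Then $N(\mathfrak{a}) \mid N(\mathfrak{b})$.

Proof. By hypothesis there exists an ideal $\mathfrak{c}$ such that $\mathfrak{b} = \mathfrak{ac}$. By Proposition 13.45, $N(\mathfrak{b}) = N(\mathfrak{a})N(\mathfrak{c})$, which implies the desired conclusion. ∎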


The converse of the above corollary is false. The ideals $\mathfrak{a} = (1+\sqrt{-6})$ and $\mathfrak{b} = (1-\sqrt{-6})$ have the same norm of 7, yet neither divides the other.

In general, $N(\mathfrak{a}) = 1$ if and only if $\mathfrak{a} = (1)$. In one direction this is trivial since clearly $N((1)) = 1$. Conversely, suppose $N(\mathfrak{a}) = 1$. Then $\mathfrak{a}\bar{\mathfrak{a}} = (1)$, and so $\mathfrak{a} \mid (1)$. This, however, implies that $\mathfrak{a} \supseteq (1) = \mathbb{Z}[\sqrt{d}]$ and so $\mathfrak{a} = (1)$.

One consequence of this observation is that the norm of every nonzero ideal is a positive rational integer. This will later facilitate proofs by mathematical induction on the norm.

It is okay to define $N((0)) = 0$, and all the results about ideal norms so far extend to zero ideals. That is not the case for the next property.

Corollary 13.47 If $\mathfrak{a}$ is a nonzero ideal, then every ideal divisor of $\mathfrak{a}$, other than $\mathfrak{a}$, has norm less than $N(\mathfrak{a})$.

Proof. Let $\mathfrak{b}$ be a factor of $\mathfrak{a}$ other than $\mathfrak{a}$ itself, so $\mathfrak{a} = \mathfrak{bc}$ and $\mathfrak{c} \neq (1)$. Since $N(\mathfrak{a}) = N(\mathfrak{b})N(\mathfrak{c})$ with $N(\mathfrak{a}) \neq 0$ and $N(\mathfrak{c}) > 1$, it follows that $N(\mathfrak{b}) < N(\mathfrak{a})$. ∎

Theorem 13.36 provides an algorithm for computing the norm of an arbitrary ideal. One need simply compute the greatest common divisor in $\mathbb{Z}$ of all the listed norms and traces.
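The recipe just described is short enough to write out. The sketch below simply takes the greatest common divisor of the norms $N(\alpha_i)$ and the traces $\operatorname{Tr}(\alpha_i\bar{\alpha}_j)$, $i < j$, and therefore inherits whatever hypotheses Theorem 13.36 carries. It assumes elements $a + b\sqrt{d}$ are stored as pairs `(a, b)`; the function names are hypothetical.

```python
from itertools import combinations
from math import gcd

def norm(x, d):
    """Norm of a + b*sqrt(d): a^2 - d*b^2."""
    a, b = x
    return a * a - d * b * b

def trace_with_conjugate(x, y, d):
    """Tr(x * conj(y)) for x = a + b*sqrt(d), y = c + e*sqrt(d).

    x * conj(y) has rational part a*c - d*b*e, and the trace is twice that.
    """
    (a, b), (c, e) = x, y
    return 2 * (a * c - d * b * e)

def ideal_norm(gens, d):
    """gcd of all N(alpha_i) and all Tr(alpha_i * conj(alpha_j)), i < j."""
    h = 0
    for x in gens:
        h = gcd(h, norm(x, d))
    for x, y in combinations(gens, 2):
        h = gcd(h, trace_with_conjugate(x, y, d))
    return h

print(ideal_norm([(5, 0), (2, 1)], -6))     # the ideal (5, 2 + sqrt(-6))
print(ideal_norm([(1, 1), (1, -1)], -6))    # the ideal (1 + sqrt(-6), 1 - sqrt(-6))
print(ideal_norm([(21, -2), (5, 1)], -6))   # the ideal (21 - 2*sqrt(-6), 5 + sqrt(-6))
```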
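For example, let $\mathfrak{a} = (5, 2+\sqrt{-6})$. Then
$$\mathfrak{a}\bar{\mathfrak{a}} = (5\cdot 5, 5(2-\sqrt{-6}), 5(2+\sqrt{-6}), (2+\sqrt{-6})(2-\sqrt{-6}))$$
$$= (5\cdot 5, 5(2-\sqrt{-6}), 5(2+\sqrt{-6}), 10)$$
$$= (5)(5, 2-\sqrt{-6}, 2+\sqrt{-6}, 2)$$
$$= (5)$$
since 5 and 2 are relatively prime rational integers. It follows that $N(\mathfrak{a}) = 5$.

Let $\mathfrak{a} = (1+\sqrt{-6}, 1-\sqrt{-6})$. Since $\bar{\mathfrak{a}} = \mathfrak{a}$, it follows that
$$\mathfrak{a}\bar{\mathfrak{a}} = \mathfrak{aa} = (1+\sqrt{-6}, 1-\sqrt{-6})^2$$
$$= (1-6+2\sqrt{-6},\; 1^2-(\sqrt{-6})^2,\; 1^2-(\sqrt{-6})^2,\; 1-6-2\sqrt{-6})$$
$$= (-5+2\sqrt{-6}, 7, 7, -5-2\sqrt{-6}) = (-5+2\sqrt{-6}, 7, -10) = (1).$$
It follows that $N(\mathfrak{a}) = 1$.

Let $\mathfrak{a} = (21-2\sqrt{-6}, 5+\sqrt{-6})$. Note that $\mathfrak{a} = (31, 5+\sqrt{-6})$ so that
$$\mathfrak{a}\bar{\mathfrak{a}} = (31, 5+\sqrt{-6})(31, 5-\sqrt{-6})$$
$$= (31^2, 31(5+\sqrt{-6}), 31(5-\sqrt{-6}), 5^2+1\cdot 6)$$
$$= (31)(31, 5+\sqrt{-6}, 5-\sqrt{-6}, 1)$$
$$= (31).$$
Hence, the norm of $\mathfrak{a}$ is 31.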


Exercises 13.5 


1. Find the norms of the ideals of Exercise 13.4.1. 


13.6 Prime Ideals and Unique Factorization 


An ideal $\mathfrak{p}$ is said to be a prime ideal if $\mathfrak{p} \neq (1)$ and $\mathfrak{p}$ cannot be written as a product $\mathfrak{ab}$ unless $\mathfrak{a} = (1)$ or $\mathfrak{b} = (1)$.
Note that in our treatment the zero ideal $(0)$ is not considered to be a prime ideal.

Here is a criterion for recognizing some prime ideals.

Proposition 13.48 An ideal whose norm is prime in $\mathbb{Z}$ is a prime ideal.

Proof. Let $N(\mathfrak{a}) = p$ be prime. If $\mathfrak{a} = \mathfrak{bc}$, then, by Proposition 13.45,
$$p = N(\mathfrak{bc}) = N(\mathfrak{b})N(\mathfrak{c}).$$
Since $p$ is a rational prime, it follows that either $N(\mathfrak{b})$ or $N(\mathfrak{c})$ is 1. Hence either $\mathfrak{b}$ or $\mathfrak{c}$ is $(1)$. ∎

For example, as we saw earlier, in $\mathbb{Z}[\sqrt{-6}]$ the ideal $(5, 2+\sqrt{-6})$ has norm 5 and so this ideal is prime.

The converse to Proposition 13.48 is false: a prime ideal may have a composite norm. In $\mathbb{Z}[\sqrt{-22}]$, consider the principal ideal $(17)$ whose norm is $289 = 17^2$. It will be shown that this ideal is in fact prime. Assume $(17) = \mathfrak{ab}$ with $\mathfrak{a} \neq (1)$ and $\mathfrak{b} \neq (1)$. Taking norms we get $289 = N(\mathfrak{a})N(\mathfrak{b})$ and so $N(\mathfrak{a}) = 17$. Suppose $\mathfrak{a} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$. Clearly $\mathfrak{a} \mid (\alpha_i)$ for each $i = 1, 2, \ldots, m$ and so $17 \mid N(\alpha_i)$. We now show that, in fact, $17 \mid \alpha_i$ for all $i = 1, 2, \ldots, m$. Suppose $\alpha_i = x + y\sqrt{-22}$. Since we are stipulating that $17 \mid N(\alpha_i)$ it follows that
$$x^2 + 22y^2 \equiv 0 \pmod{17} \quad\text{or}\quad x^2 \equiv -5y^2 \pmod{17}.$$
Since $-5$ is not a quadratic residue modulo 17, it must be the case that $y \equiv 0 \pmod{17}$ and hence also $x \equiv 0 \pmod{17}$. This, of course, implies that every such integer of $\mathfrak{a}$ is divisible by 17. Let $\mathfrak{c}$ be the ideal obtained from $\mathfrak{a}$ by dividing each integer of $\mathfrak{a}$ by 17. Then $\mathfrak{a} = (17)\mathfrak{c}$. Applying norms to both sides yields
$$N(\mathfrak{a}) = N((17))N(\mathfrak{c}) \geq 17^2\cdot 1 = 289$$
which contradicts the fact that $N(\mathfrak{a}) = 17$. Hence $(17)$ is a prime ideal of $\mathbb{Z}[\sqrt{-22}]$ whose norm is the composite number 289.
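The arithmetic fact on which this example turns, that $-5$ is not a quadratic residue modulo 17, is easily confirmed by exhaustive search, as is the conclusion that $x^2 + 22y^2 \equiv 0 \pmod{17}$ forces $x \equiv y \equiv 0$. The following is a small illustrative check, not part of the argument itself; the variable names are arbitrary.

```python
p = 17

# All nonzero squares modulo 17.
squares = {(x * x) % p for x in range(1, p)}

# -5 is congruent to 12 modulo 17; is it among the squares?
minus_five_is_residue = (-5) % p in squares

# Every residue pair (x, y) with x^2 + 22*y^2 divisible by 17.
solutions = [(x, y) for x in range(p) for y in range(p)
             if (x * x + 22 * y * y) % p == 0]

print(sorted(squares), minus_five_is_residue, solutions)
```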
Proposition 13.49 If an ideal is prime, then its conjugate ideal is also prime.

Proof. See Exercise 13.6.6. ∎

We now set out to prove that just like the rational integers, the ideals of $\mathbb{Z}[\sqrt{d}]$ have the unique factorization property. The proof for the irrational quadratic case is very similar to that for the rational integers. Recall that the proof that the rational integers have the unique factorization property has three steps:

(a) show that prime numbers satisfy the property $p \mid ab \Rightarrow p \mid a$ or $p \mid b$;

(b) show by induction that every positive integer greater than 1 has a prime factorization; and

(c) show by induction that the prime factorization is unique, using the first step and cancellation to reduce to a smaller case.

The following proposition is an analog of the first step.

Proposition 13.50 If $\mathfrak{p}$ is a prime ideal and $\mathfrak{p} \mid \mathfrak{ab}$, then $\mathfrak{p} \mid \mathfrak{a}$ or $\mathfrak{p} \mid \mathfrak{b}$.

Proof. We will assume that $\mathfrak{p}$ divides $\mathfrak{ab}$ but does not divide $\mathfrak{a}$. The ideal $\mathfrak{p} + \mathfrak{a}$ is a common divisor of $\mathfrak{p}$ and $\mathfrak{a}$ since it contains both. Since $\mathfrak{p}$ is a prime ideal, the only divisors of $\mathfrak{p}$ are $\mathfrak{p}$ and $(1)$. Because it was assumed that $\mathfrak{p}$ does not divide $\mathfrak{a}$, $\mathfrak{p} + \mathfrak{a} \neq \mathfrak{p}$, since otherwise $\mathfrak{p} + \mathfrak{a} = \mathfrak{p}$ would imply that $\mathfrak{a} \subseteq \mathfrak{p}$, meaning that $\mathfrak{p}$ contains $\mathfrak{a}$, or $\mathfrak{p}$ divides $\mathfrak{a}$, contradicting the assumption that begins this proof. Therefore $\mathfrak{p} + \mathfrak{a} = (1)$, and hence $1 = \pi + \alpha$ for some $\pi \in \mathfrak{p}$ and $\alpha \in \mathfrak{a}$. Hence for any $\beta \in \mathfrak{b}$
$$\beta = 1\cdot\beta = \pi\beta + \alpha\beta \in \mathfrak{p} + \mathfrak{ab} = \mathfrak{p},$$
showing that $\mathfrak{b} \subseteq \mathfrak{p}$. Thus, $\mathfrak{p} \mid \mathfrak{b}$. ∎


Corollary 13.51 If $\mathfrak{p}$ is prime and $\mathfrak{p} \mid \mathfrak{a}_1\mathfrak{a}_2\cdots\mathfrak{a}_r$, then $\mathfrak{p} \mid \mathfrak{a}_i$ for some $i$.

Proof. By induction on $r$ (see Exercise 13.6.7). ∎

We now work out the analog of the second step toward prime factorizations.

Proposition 13.52 Every nonzero ideal $\neq (1)$ admits a prime ideal factorization.

Proof. Mimic the proof of Theorem 4.9 by using induction on the ideal norm. ∎

Proposition 13.53 The prime factorization of any nonzero ideal $\neq (1)$ is unique up to the order of the factors. That is, for any nonzero $\mathfrak{a} \neq (1)$, if
$$\mathfrak{a} = \mathfrak{p}_1\mathfrak{p}_2\cdots\mathfrak{p}_r = \mathfrak{q}_1\mathfrak{q}_2\cdots\mathfrak{q}_s$$
where the $\mathfrak{p}_i$'s and $\mathfrak{q}_j$'s are prime ideals, then $r = s$ and
$$\mathfrak{p}_i = \mathfrak{q}_i \quad\text{for all } i = 1, 2, \ldots, r$$
after a suitable reindexing.

Proof. By induction on the norm of the ideal. A prime ideal, by definition, has a unique prime factorization, namely, itself. This verifies the proposition for all prime ideals, including those of norm 2. For norm $n \geq 3$, suppose that the proposition holds for all ideals of norm $2, 3, \ldots, n-1$. Let $\mathfrak{a}$ be an ideal of norm $n$ with two prime factorizations
$$\mathfrak{a} = \mathfrak{p}_1\mathfrak{p}_2\cdots\mathfrak{p}_r = \mathfrak{q}_1\mathfrak{q}_2\cdots\mathfrak{q}_s. \qquad (13.54)$$
We may assume that this ideal is not prime, so that $r > 1$ and $s > 1$. Because $\mathfrak{p}_1 \mid \mathfrak{a}$, it follows from Corollary 13.51 above that $\mathfrak{p}_1 \mid \mathfrak{q}_j$ for some $j = 1, 2, \ldots, s$. The commutativity of ideal multiplication (Proposition 13.17) makes it possible to relabel the ideals $\mathfrak{q}_j$ if necessary, so that $\mathfrak{p}_1 \mid \mathfrak{q}_1$. Since $\mathfrak{q}_1$ is prime and $\mathfrak{p}_1 \neq (1)$, it follows that $\mathfrak{p}_1 = \mathfrak{q}_1$. Since all nonzero ideals are cancelable (Corollary 13.38), it follows that
$$\mathfrak{p}_2\mathfrak{p}_3\cdots\mathfrak{p}_r = \mathfrak{q}_2\mathfrak{q}_3\cdots\mathfrak{q}_s.$$
As the left-hand side has a norm that is smaller than $n$, the induction hypothesis is applicable. Hence $r - 1 = s - 1$, or $r = s$, and the ideals of the right-hand side can be relabeled so that $\mathfrak{p}_i = \mathfrak{q}_i$ for all $i = 1, 2, \ldots, r$. ∎




Theorem 13.55 The integers of Z[√d] have the unique factorization property if and
only if every ideal of Z[√d] is principal.


Proof. First, suppose the integers of Z[√d] have the unique factorization property.

Step 1: For every irreducible π ∈ Z[√d], the principal ideal (π) is prime. Let 𝔞 be
an ideal that divides (π), so that 𝔞 ⊇ (π). It suffices to show that 𝔞 is either (1) or
(π). Suppose 𝔞 ≠ (π) so that there is an element α ∈ 𝔞 such that α ∉ (π). By definition,
there exists an ideal 𝔟 such that (π) = 𝔞𝔟. Clearly 𝔟 | (π) and for every β ∈ 𝔟, αβ ∈ 𝔞𝔟 = (π),
and so π | αβ.

By the unique factorization of integers, π is an irreducible factor of either α or β. Since
π does not divide α, it must be that π | β, and so 𝔟 ⊆ (π) since β is an arbitrary element
of 𝔟. Together with 𝔟 ⊇ (π) this gives 𝔟 = (π), so that (π) = 𝔞(π), and cancellation
(Corollary 13.38) yields 𝔞 = (1).

Step 2: Every prime ideal in Z[√d] is principal. Let 𝔭 be a prime ideal. Then
𝔭 | (a) for some nonzero rational integer a (one can always use a = N(𝔭)). Factor a into
irreducibles in Z[√d], say a = π_1 π_2 ··· π_k. Then (a) = (π_1)(π_2)···(π_k) and so 𝔭
divides some (π_i). Since this (π_i) is prime by Step 1, it follows that 𝔭 = (π_i).

Step 3: Every ideal in Z[√d] is principal. The zero ideal is clearly principal. Every
nonzero ideal is a product of prime ideals which are principal by Step 2. Since the product
of principal ideals is principal (Proposition 13.22), every nonzero ideal is also principal.
This concludes the only if (⇒) part of the proof.

To prove the converse assume that every ideal in Z[√d] is principal. The existence of
a factorization into irreducibles is guaranteed by Proposition 13.50. To prove uniqueness
it suffices to show that for any irreducible π, π | αβ implies that either π | α or π | β.
(Recall that the analog of this property for prime ideals in the proof of Proposition 13.50
was used to prove the uniqueness of prime factorizations.)

So suppose π | αβ and π does not divide α. The only factors of π are units and unit
multiples of π, so the only common divisors of π and α are units. The ideal (π, α) is
principal by hypothesis, say (π, α) = (δ). It follows that δ is a unit, so (π, α) = (1).
Thus, there exist integers x, y ∈ Z[√d] such that πx + αy = 1. Multiplying through by
β we get

πβx + αβy = β.

Since π | αβ, it follows that π | β. ∎




Exercises 13.6 


1. Explain why units do not appear in either the statement or the proof of Proposition 13.53.
2. Is the ideal (5) prime in Z[√−5]?
3. Is the ideal (5) prime in Z[√−6]?
4. Is the ideal (6) prime in Z[√−5]?
5. Is the ideal (6) prime in Z[√−6]?
6. Prove Proposition 13.49.
7. Prove Corollary 13.51.


13.7 Constructing Prime Ideals 


The prime ideals of Z[√d] will now be classified in a scheme similar to that used for the
Gaussian integers (Theorem 12.48).


Theorem 13.56 Every prime ideal in Z[√d] divides a unique prime rational integer.
That is, if 𝔭 is prime, then 𝔭 | (p) for exactly one prime p of Z⁺.


Proof. The ideal 𝔭𝔭̄ = (N(𝔭)), where 𝔭̄ is the conjugate ideal of 𝔭, is divisible by 𝔭 and
has a generator in Z⁺. Since 𝔭 ≠ (1), N(𝔭) > 1.

Factor N(𝔭) into primes in Z⁺, say

N(𝔭) = p_1 p_2 ··· p_k.

Then

𝔭𝔭̄ = (p_1 p_2 ··· p_k) = (p_1)(p_2)···(p_k),

so 𝔭 divides some (p_i) by Corollary 13.51.
For the uniqueness, assume that 𝔭 | (p) and 𝔭 | (q) for two different prime rational
integers. Since p and q are relatively prime, it follows from Proposition 13.20 that
𝔭 = (1), a contradiction. ∎


Corollary 13.57 Every prime ideal of Z[√d] has norm p or p² for some rational
prime p.


Proof. Let 𝔭 be a prime ideal in Z[√d]. Then there is a rational prime integer p such
that 𝔭 | (p). Taking ideal norms, we get N(𝔭) | N((p)). Since

N((p)) = |N(p)| = p²

it follows that N(𝔭) is p or p². ∎


The significance of Theorem 13.56 is that it facilitates the discovery of all the prime
ideals of Z[√d].
The following theorem describes how each prime number (really the principal ideal
generated by each prime number) factors in Z[√d] and thus shows us what all the prime
ideals of Z[√d] look like.


Theorem 13.58 For each (rational) prime number p:
(a) If d is a quadratic nonresidue of p, then (p) is prime in Z[√d] with norm p².
(b) If d is a quadratic residue modulo p with distinct square roots c and c′, then
(p) = 𝔭𝔭̄ where 𝔭 has norm p, and 𝔭 = (p, √d − c).
(c) If d is a quadratic residue modulo p with a double square root c, then (p) = 𝔭²
where 𝔭 has norm p, and 𝔭 = (p, √d − c).


Proof. Proof of (a): We show that if p is a rational prime and d is a quadratic nonresidue
of p, then (p) is a prime ideal. Suppose not. Then there exists a nontrivial factorization
(p) = 𝔞𝔟. A standard norm argument leads to the conclusion that N(𝔞) = p. Thus, 𝔞 is
a prime ideal. Since (p) ⊆ 𝔞 and (p) ≠ 𝔞, it follows that there exists an element
a + b√d such that

a + b√d ∈ 𝔞, a + b√d ∉ (p).

Since 𝔞 | (a + b√d), it follows by taking norms that p | N(a + b√d), or

a² − db² ≡ 0 (mod p).

If b ≡ 0 (mod p), then necessarily also a ≡ 0 (mod p), which contradicts the fact that
a + b√d is not in (p). Hence b ≢ 0 (mod p). It follows that when the congruence above
is divided by b² we get

(ab⁻¹)² ≡ d (mod p),

where b⁻¹ denotes the inverse of b modulo p, contradicting the assumption that d is a
quadratic nonresidue modulo p. In other words, if d is a quadratic nonresidue modulo p,
then (p) is a prime ideal.




Proof of (b): Next suppose p is a rational prime and d is a quadratic residue modulo
p with distinct roots c and c′. Set

𝔭 = (p, √d − c).

Since p ∈ 𝔭, 𝔭 | (p). Trivially, √d − c ∈ 𝔭. By Proposition 13.4, √d − c ∉ (p) and
hence 𝔭 ≠ (p). Therefore, N(𝔭) properly divides p² and so N(𝔭) is 1 or p. Clearly,

𝔭̄ = (p, −√d − c)

and hence

𝔭𝔭̄ = (p, √d − c)(p, −√d − c)
   = (p², p(√d − c), p(−√d − c), c² − d).

Since c² − d ≡ 0 (mod p), it follows that

𝔭𝔭̄ = (p)(p, √d − c, −√d − c, (c² − d)/p).

Since both the ideals 𝔭𝔭̄ and (p) have norm p² it follows that the norm of
(p, √d − c, −√d − c, (c² − d)/p) is 1 and hence this is the unit ideal so that 𝔭𝔭̄ = (p).
Note that N(𝔭) = p, where p is prime, implies that 𝔭 is a prime ideal. It remains to
show that 𝔭 ≠ 𝔭̄. Assume otherwise, so that

(p) = 𝔭² = (p, √d − c)(p, √d − c)
    = (p², p(√d − c), p(√d − c), (√d − c)²)
    = (p², p(√d − c), (√d − c)²)
    = (p², p(√d − c), −2c√d + c² + d).

In order for this to equal (p) the third generator needs to be divisible by p, which
requires p | 2c, that is, c ≡ −c (mod p). Since the two square roots of d modulo p are
c and c′ ≡ −c, it follows that

c ≡ c′ (mod p),

which contradicts the assumption that c and c′ are distinct modulo p.

Proof of (c): In the remaining case d is a quadratic residue modulo p with a double
square root c.

Set 𝔭 = (p, √d − c). Arguing as was done above, N(𝔭) is p and hence 𝔭 is a prime ideal.
Next we compute

𝔭² = (p, √d − c)(p, √d − c)
   = (p², p(√d − c), (√d − c)²)
   = (p², p(√d − c), d − 2c√d + c²)

since (√d)² = d.
However, c² + d ≡ 0 (mod p) and 2c ≡ 0 (mod p) and hence both c² + d and 2c√d
are divisible by p. Consequently (p) can be factored from the generators of 𝔭²
above, resulting in

𝔭² = (p)(p, √d − c, (−2c√d + c² + d)/p).

Taking norms on both sides shows that the last ideal on the right has norm 1 and hence
is (1). ∎


We demonstrate several applications of this theorem:

Is (3) a prime ideal in Z[√−1]? Since −1 is a quadratic nonresidue modulo 3, it
follows from Theorem 13.58 that (3) is a prime ideal in Z[√−1].

Is (5) a prime ideal in Z[√−1]? Since −1 is a quadratic residue modulo 5, in fact
−1 ≡ (±2)² (mod 5), if we choose the value 2 for c, then

(5) = (5, −2 + √−1)(5, −2 − √−1)

is the prime factorization of the ideal (5) (see Exercise 13.7.6).

Is (5) a prime ideal in Z[√−6]? Since

−6 ≡ 2² ≡ 3² (mod 5)

the ideal (5) has the prime ideal factorization

(5, √−6 − 2)(5, −√−6 − 2).

Is (7) a prime ideal in Z[√−6]? Here 1² ≡ 6² ≡ 1 ≡ −6 (mod 7) and hence (7) factors
into

(7, √−6 − 1)(7, −√−6 − 1).

Is (11) a prime ideal in Z[√−14]? As −14 is a quadratic nonresidue modulo 11, it
follows that (11) is a prime ideal of Z[√−14].

Is (7) a prime ideal in Z[√−14]? Since −14 ≡ 0 (mod 7), d is a quadratic residue
modulo 7 with the double square root 0. Consequently, by part (c) of Theorem 13.58,
(7) = (7, √−14)² and (7) is not prime.

Is (2) a prime ideal in Z[√−1]? Since −1 is a quadratic residue modulo 2 with a
double square root, it follows from part (c) of Theorem 13.58 that

(2) = (1 + √−1)²

where (1 + √−1) is a prime ideal of Z[√−1].
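The case analysis of Theorem 13.58 can be checked mechanically for small primes. The following is a minimal sketch, not part of the original text: the helper names `square_roots_mod` and `classify` are hypothetical, and the square roots are found by brute force, which is adequate only for small p.

```python
def square_roots_mod(d, p):
    """All c in {0, ..., p-1} with c*c ≡ d (mod p), found by brute force."""
    return [c for c in range(p) if (c * c - d) % p == 0]

def classify(d, p):
    """Which case of Theorem 13.58 applies to the rational prime p in Z[sqrt(d)]."""
    roots = square_roots_mod(d, p)
    if not roots:
        return "prime", roots                            # case (a): nonresidue
    if len(roots) == 1:
        return "square of a prime ideal", roots          # case (c): double root
    return "product of two distinct prime ideals", roots  # case (b)

# A few of the (d, p) pairs discussed above.
for d, p in [(-1, 3), (-1, 5), (-6, 5), (-6, 7), (-14, 11), (-1, 2)]:
    kind, roots = classify(d, p)
    print(f"({p}) in Z[sqrt({d})]: {kind}; square roots of d mod p: {roots}")
```

In case (b) or (c), any root c returned by the sketch gives the generator pair (p, √d − c) named in the theorem.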

Listed in Tables 13.1 and 13.2 are summaries of the calculations leading to “small”
prime ideals of Z[√d] for d = −5 and −6. The entry in the column marked p can be
any rational prime integer.

For example, to factor the principal ideal 𝔞 = (2 + √−6) into prime ideals, notice first
that the norm of (2 + √−6) is 2² − (−6) = 10. Hence the norms of the prime factors of
𝔞 are 2 and 5, and consequently (2 + √−6) has two factors: one from the pair {𝔭_2, 𝔭̄_2}
and one from the pair {𝔭_5, 𝔭̄_5}. However, since 𝔭_2 = 𝔭̄_2, it follows that 𝔭_2 is a prime
factor of (2 + √−6). On the other hand, 𝔭_5 = (2 + √−6, 5) contains 2 + √−6 and hence
𝔭_5 is the other prime factor of (2 + √−6). Thus

(2 + √−6) = 𝔭_2 𝔭_5

is the required prime factorization.
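Deciding whether a prime ideal of the form (c + √d, p) contains a given integer reduces to a single congruence: modulo that ideal √d ≡ −c, so a + b√d lies in it exactly when a − bc ≡ 0 (mod p). The sketch below applies this to the example just worked; the function name `in_prime_ideal` is an illustrative assumption, and the test is valid only when c² ≡ d (mod p).

```python
def in_prime_ideal(a, b, c, p, d):
    """True if a + b*sqrt(d) lies in the ideal (c + sqrt(d), p).

    Assumes c*c ≡ d (mod p), so that the ideal is exactly
    { a + b*sqrt(d) : a ≡ b*c (mod p) }.
    """
    assert (c * c - d) % p == 0, "c must be a square root of d modulo p"
    # Modulo the ideal, sqrt(d) ≡ -c, hence a + b*sqrt(d) ≡ a - b*c.
    return (a - b * c) % p == 0

d = -6
a, b = 2, 1                                   # the integer 2 + sqrt(-6), of norm 10
print(in_prime_ideal(a, b, 0, 2, d))          # membership in (sqrt(-6), 2)
print(in_prime_ideal(a, b, 2, 5, d))          # membership in (2 + sqrt(-6), 5)
print(in_prime_ideal(a, b, -2, 5, d))         # membership in the conjugate (-2 + sqrt(-6), 5)
```

The first two calls report membership and the third does not, in agreement with the factorization obtained above.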




 p    −5 ≡ c² (mod p)              𝔭_p                       (p)          N(𝔭_p)
 2    −5 ≡ 1 ≡ c² (mod 2)          (1 + √−5, 2)              𝔭_2²         2
 3    −5 ≡ 1 ≡ c² (mod 3)          (1 + √−5, 3)              𝔭_3 𝔭̄_3      3
 5    −5 ≡ 0 ≡ c² (mod 5)          (√−5, 5) = (√−5)          𝔭_5²         5
 7    −5 ≡ 2 ≡ c² (mod 7)          (3 + √−5, 7)              𝔭_7 𝔭̄_7      7
11    −5 ≡ 6 ≢ c² (mod 11)         (11)                      (11)         11²
13    −5 ≡ 8 ≢ c² (mod 13)         (13)                      (13)         13²
17    −5 ≡ 12 ≢ c² (mod 17)        (17)                      (17)         17²
19    −5 ≡ 14 ≢ c² (mod 19)        (19)                      (19)         19²
23    −5 ≡ 18 ≡ c² (mod 23)        (8 + √−5, 23)             𝔭_23 𝔭̄_23    23


Table 13.1 Some calculations in Z[√−5]


 p    −6 ≡ c² (mod p)              𝔭_p                       (p)          N(𝔭_p)
 2    −6 ≡ 0 ≡ c² (mod 2)          (√−6, 2)                  𝔭_2²         2
 3    −6 ≡ 0 ≡ c² (mod 3)          (√−6, 3)                  𝔭_3²         3
 5    −6 ≡ 4 ≡ c² (mod 5)          (2 + √−6, 5)              𝔭_5 𝔭̄_5      5
 7    −6 ≡ 1 ≡ c² (mod 7)          (1 + √−6, 7)              𝔭_7 𝔭̄_7      7
11    −6 ≡ 5 ≡ c² (mod 11)         (4 + √−6, 11)             𝔭_11 𝔭̄_11    11
13    −6 ≡ 7 ≢ c² (mod 13)         (13)                      (13)         13²
17    −6 ≡ 11 ≢ c² (mod 17)        (17)                      (17)         17²
19    −6 ≡ 13 ≢ c² (mod 19)        (19)                      (19)         19²
23    −6 ≡ 17 ≢ c² (mod 23)        (23)                      (23)         23²


Table 13.2 More calculations in Z[√−6]




To factor the principal ideal (4 + 3√−6) into prime ideals, note that the norm of 4 + 3√−6 is
4² − 3²(−6) = 70 = 2·5·7. Hence the prime factors of (4 + 3√−6) have norms
2, 5, and 7 and consequently this ideal has as prime factors 𝔭_2, one of the pair {𝔭_5, 𝔭̄_5},
and one of the pair {𝔭_7, 𝔭̄_7}.

The ideal 𝔭_5 is a factor of (4 + 3√−6) if and only if

4 + 3√−6 ∈ (2 + √−6, 5).

To decide this, we look for integers α = a + b√−6 and β = c + d√−6 such that
4 + 3√−6 = α(2 + √−6) + β(5), or

4 + 3√−6 = (a + b√−6)(2 + √−6) + (c + d√−6)(5).

Equating real and purely imaginary parts we get 4 = 2a − 6b + 5c and 3 = a + 2b + 5d.
Modulo 5 the first equation reads 2a − b ≡ 4, while twice the second reads
2a + 4b ≡ 2a − b ≡ 1, so the system has no solution in rational integers. Hence 𝔭_5
does not divide (4 + 3√−6), so that 𝔭̄_5 = (2 − √−6, 5) does divide it; indeed,
4 + 3√−6 = (−3)(2 − √−6) + 2·5.
Finally, we turn to 𝔭_7. It divides (4 + 3√−6) if and only if 4 + 3√−6 ∈ (1 + √−6, 7).
The same substitution that was used above leads to the system of equations 4 = a − 6b + 7c
and 3 = a + b + 7d. When these two equations are subtracted from each other, we
obtain 1 = −7b + 7c − 7d, which contradicts the requirement that b, c, and d are
rational integers. Hence, 𝔭_7 does not divide (4 + 3√−6), so that 𝔭̄_7 does divide it.
Consequently

(4 + 3√−6) = 𝔭_2 𝔭̄_5 𝔭̄_7.


Exercises 13.7 


1. Construct the analog of Table 13.1 for d = −10.
2. Construct the analog of Table 13.1 for d = −13.
3. Construct the analog of Table 13.1 for d = −14.
4. Prove that every ideal divides some rational integer.
5. Let I be an ideal and suppose α and β are rational primes in I. Prove that α and
β are associates of each other.
6. Prove that (5) = (5, −2 + √−1)(5, −2 − √−1) in Z[√−1].




Chapter Summary 
We constructed a large number of domains whose ideals do provide unique factorizations 
into primes. 


Chapter Review Exercises 


Mark the following true or false. 
1. Every nonzero nonunit in Z[√d] is a product of irreducibles in Z[√d].
2. Ideals are real.
3. The only nonprincipal ideal of Z[√−5] is (3, 1 + √−5).


New Terms

associates
cancelable ideal
conjugate
conjugate ideal
greatest common divisor
ideal multiplication
ideals
irreducible
norm
prime ideal
principal ideal
product
quadratic field
trace
unit


Chapter 14

ABSTRACT RINGS


THE NOTION of a ring was introduced by David Hilbert circa 1896 in order to provide
axiomatic descriptions of such mathematical structures as R, Q, Z, Z_n, Z[√d],
F[X], and many others. They all possess two mathematical operations that behave very
much like the standard, school-taught arithmetic. It therefore made sense to isolate them
and identify them by an appropriate nomenclature.


14.1 Rings


A (commutative) ring (with unity) is a set R with two binary operations, usually denoted
by + and ·, and two distinct special elements, usually denoted by 0 and 1, for which the
following hold: For any elements a, b, c of R

a + b ∈ R                      a · b ∈ R                      (closure)
(a + b) + c = a + (b + c)      (a · b) · c = a · (b · c)      (associativity)
a + b = b + a                  a · b = b · a                  (commutativity)
a · (b + c) = a · b + a · c                                   (distributivity)

there exist distinct elements 0 and 1 in R such that

a + 0 = a                      a · 1 = a                      (identities)

and there exists an element −a ∈ R such that

a + (−a) = 0                                                  (additive inverse).


Note that these properties are nearly identical with those of a field, except for the issue
of the existence of multiplicative inverses. The nonzero elements of rings are not required
to possess multiplicative inverses. Consequently, every field is a ring but rings need not
be fields. The integers Z do constitute a ring, as does Z_n for n = 2, 3, 4, .... Because of
the stipulation that 0 ≠ 1 a ring must have at least two elements and hence Z_0 and Z_1
are not rings.

The reader is already familiar with a variety of rings including Galois fields, as well
as the fields Q, R, and C. If F is any field, such as those just listed, then F[X] is a
polynomial ring over F, as is F[X, Y]. The Gaussian integers constitute a ring, as does
each set of the form

Z[√d] = { a + b√d | a, b ∈ Z }

where d is a negative square-free integer. These rings are called quadratic domains. Some
were discussed in the previous chapter, and others will be examined closely in this one.
Their elements are also called integers and in order to distinguish between them and
the classical integers 0, ±1, ±2, ±3, ... these latter will be referred to as rational integers
as opposed to the complex integers, or irrational integers that constitute the quadratic
domains.

The element a of the ring R is said to be a unit of R if there exists an element b ∈ R
such that ab = 1. The sets of units of the rings we encountered before have been small
commutative groups, but that is no longer the case. While the units always form a
commutative group, their number may be infinite.
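For a finite candidate such as Z_n the ring axioms and the units can be checked exhaustively. The sketch below is illustrative only; the names `is_ring` and `units` are assumptions made here, and the check covers only Z_n = {0, ..., n − 1} with addition and multiplication modulo n, for n ≥ 1.

```python
from itertools import product

def is_ring(n):
    """Brute-force check of the ring axioms for Z_n with + and · modulo n (n >= 1)."""
    R = range(n)
    add = lambda a, b: (a + b) % n
    mul = lambda a, b: (a * b) % n
    zero, one = 0 % n, 1 % n
    if zero == one:                                   # the axioms require 0 != 1
        return False
    for a, b, c in product(R, repeat=3):
        if add(add(a, b), c) != add(a, add(b, c)):    # associativity of +
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):    # associativity of ·
            return False
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):   # distributivity
            return False
    for a, b in product(R, repeat=2):
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):  # commutativity
            return False
    # identities and additive inverses
    return all(add(a, zero) == a and mul(a, one) == a
               and any(add(a, b) == zero for b in R) for a in R)

def units(n):
    """Elements of Z_n that have a multiplicative inverse."""
    return [a for a in range(n) if any((a * b) % n == 1 for b in range(n))]

print(is_ring(6), units(6))
print(is_ring(1))
```

Closure needs no separate test here because every result is reduced modulo n.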


We now go on to derive some of the elementary properties of rings. 


Proposition 14.1 If a is an element of the ring R, then a · 0 = 0.


Proof. First,

a · 0 = a · (0 + 0) = a · 0 + a · 0.

If we now add −(a · 0) to both sides of the equation above, then we have 0 = a · 0. ∎

Proposition 14.2 For any elements a and b of the ring R,
(a) (−1) · a = −a;
(b) (−a) · b = a · (−b) = −(a · b);
(c) every element of R has at most one additive inverse;
(d) every element of R has at most one multiplicative inverse.


Proof. See Exercises 14.1.4 to 14.1.7. ∎


In any product a · b it is customary to omit the dot and simply write ab. An element
a of a ring is said to be irreducible if it is a nonunit and in any factorization a = bc, either
b or c must be a unit. This seems to be a reasonable enough extension of the notion of
primality from the integer ring Z to arbitrary rings, but, for reasons that probably have to
do with 20/20 hindsight, Kummer felt that it was another definition that caught the true
essence of primality (see Section 12.6). This stands in marked contrast to the definition
used by all mathematicians from Euclid circa 300 BCE to Gauss only 10 years earlier, that
a (positive) prime number is one that is only divisible by 1 and itself. An element p of a
ring is prime if whenever p | ab, then either p | a or p | b. It follows from Lemma 4.6
that the primes of the ring of integers Z are also prime in this sense. Exercise 6.3.15
demonstrates this for the polynomial ring F[x] over any field F. Generally speaking,
the notions of irreducibility and primeness are, to some extent, independent of each
other. The element 2 in Z_6 is prime because every even entry in the multiplication table
of Table 4.3 must belong to either an even-numbered column or an even-numbered
row. On the other hand, 2 is not irreducible because 2 ≡ 2 × 4 (mod 6). The converse
situation cannot happen (see Exercise 14.1.1).
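The claim about the element 2 of Z_6 can be confirmed by applying the two definitions literally. This is a small sketch under the assumption that Z_n is modeled as {0, ..., n − 1} with arithmetic modulo n; the function names are hypothetical.

```python
def divides(a, b, n):
    """True if a | b in Z_n, i.e. a*x ≡ b (mod n) for some x."""
    return any((a * x) % n == b for x in range(n))

def is_unit(a, n):
    return divides(a, 1, n)

def is_prime_element(p, n):
    """p | ab implies p | a or p | b, for all a, b in Z_n."""
    return all(divides(p, a, n) or divides(p, b, n)
               for a in range(n) for b in range(n)
               if divides(p, (a * b) % n, n))

def is_irreducible(a, n):
    """a is a nonunit and every factorization a = bc has b or c a unit."""
    return (not is_unit(a, n)
            and all(is_unit(b, n) or is_unit(c, n)
                    for b in range(n) for c in range(n)
                    if (b * c) % n == a))

print(is_prime_element(2, 6), is_irreducible(2, 6))
```

The factorization 2 ≡ 2 × 4 (mod 6) is the one that makes `is_irreducible(2, 6)` fail, since neither 2 nor 4 is a unit of Z_6.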


Exercises 14.1 


1. Prove that the rational integer n is prime if and only if it is irreducible.

2. Let n > 1 and k ≥ 1 be positive rational integers with the property P(n, k): For
every a, b ∈ Z, if n | a^k b^k, then either n | a^k or n | b^k.
(a) Prove that if P(n, 1) is true, then there exists a rational prime p such that
n ∈ {1, p}.
(b) Prove that if P(n, 2) is true, then there exists a rational prime p such that
n ∈ {1, p, p²}.
(c) Prove that if P(n, k) is true, then there exists a rational prime p such that
n ∈ {1, p, p², ...}.

3. The positive rational integer n is divisible by a square if and only if there exists a
rational integer m such that n ∤ m and n | m².

4. In the proof of Proposition 14.2, prove that (−1) · a = −a.

5. In the proof of Proposition 14.2, prove that (−a) · b = a · (−b) = −(a · b).

6. In the proof of Proposition 14.2, prove that every element of R has at most one
additive inverse.

7. In the proof of Proposition 14.2, prove that every element of R has at most one
multiplicative inverse.




For each of the Exercises 14.1.8 to 14.1.20 specify a set R and two binary operations on 
R. Decide which of these determine a ring and which do not. Identify the units. Justify 
your answers. 

8. Z,, +, ·

9. Z_3, +, ·

10. Z,, +, ·

11. Zo, +, ·

12. R, +, ·

13. R, subtraction, multiplication 

14. R?, vector addition, dot product 

15. R³, vector addition, cross product

16. All the subsets of Z, union, intersection

17. All the subsets of Z, union, Δ, where Δ(A, B) = (A ∩ Bᶜ) ∪ (B ∩ Aᶜ)

18. Z, −, ·

19. { f : R → R }, +, ·

20. { f : R → R | f is continuous everywhere }, +, ·


14.2 Ideals 


An ideal of the ring R is a subset I of R such that for any a, b ∈ I and r ∈ R

a + b ∈ I and ra ∈ I.

Two ideals are said to be equal if they are equal as sets. The prototypical example for an
ideal is the set of even numbers in the rational integers Z. More generally, if a is any
element of the ring Z, then

(a) = { ra | r ∈ Z } = {0, ±a, ±2a, ±3a, ...}.

More generally yet, if the ring Z is replaced by an arbitrary ring R, then

(a) = { ra | r ∈ R }

is an ideal of R. Ideals of the type displayed above are said to be generated by a and are
called principal ideals. In particular, (1) = R which is referred to as the unit ideal. This
observation generalizes to the next proposition.


Proposition 14.3 If a is an element of the ring R, then (a) = R if and only if a is a
unit of R.


Proof. Suppose first that (a) = R. It follows that there is an element b ∈ R such that
ab = 1. This means that a is invertible and hence it is a unit.

Conversely, if a is a unit of R, it has an inverse a⁻¹. So if r is any element of R, then
r = (aa⁻¹)r = a(a⁻¹r) ∈ (a). Thus, (a) = R. ∎


Let S = {a_1, a_2, ..., a_n} be a finite subset of the ring R. The ideal generated by S is
the following subset of R:

{ r_1 a_1 + r_2 a_2 + ··· + r_n a_n | r_1, r_2, ..., r_n ∈ R }

where n is any positive integer. This is denoted as

(a_1, a_2, ..., a_n).

For example, {1, √−1} generates Z[i].

In Z, (2, 3) = (1), (4, 6) = (2) and, in general, (a, b) = (g) where g is the greatest
common divisor (a, b). (See Exercise 14.2.6.)
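The identity (a, b) = (g) in Z can be illustrated numerically by comparing, inside a bounded window, the combinations ra + sb with the multiples of g. The following is only a finite sanity check under stated assumptions (nonzero a and b, a hypothetical window size `bound`, and coefficient limits chosen large enough for that window), not a proof.

```python
from math import gcd

def combos(a, b, bound):
    """Values r*a + s*b with |value| <= bound, for coefficients in a finite range."""
    k = bound * max(abs(a), abs(b))       # large enough to reach every multiple of gcd
    return {r * a + s * b
            for r in range(-k, k + 1) for s in range(-k, k + 1)
            if abs(r * a + s * b) <= bound}

def multiples(g, bound):
    """Multiples of g in the window [-bound, bound]."""
    return {m for m in range(-bound, bound + 1) if m % g == 0}

print(combos(4, 6, 12) == multiples(gcd(4, 6), 12))
print(combos(2, 3, 10) == multiples(gcd(2, 3), 10))
```

Both comparisons hold, matching (4, 6) = (2) and (2, 3) = (1).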

An ideal A is said to be divisible by an ideal B if there exists an ideal C such that
A = BC. The unit ideal (1) clearly consists of the whole ring R and it follows from the
equation I · (1) = I that every ideal in R is divisible by (1). It will prove convenient to
know that the unit ideal of a ring is characterized by this property:


Proposition 14.4 If the ideal A is divisible by the ideal C, then A ⊆ C.

Proof. Suppose A = CD where C = (c_1, c_2) and D = (d_1, d_2) so that

A = CD = (c_1 d_1, c_1 d_2, c_2 d_1, c_2 d_2).

By the defining properties of ideals each of the four products above belongs to C and so
A ⊆ C. ∎


An ideal that is different from R and is only divisible by the unit ideal R and itself is 
said to be an irreducible ideal. An ideal that is not irreducible is composite. 


We now show that the apparent paradox of the equation

2 · 3 = (1 + √−5)(1 − √−5) (14.5)

can be resolved by replacing each of the four integers of Equation 14.5 with the principal
ideal it generates and viewing them as elements of the set of all ideals of Z[√−5]. In
that context (see Exercise 14.2.6)

(2) = (2, 1 + √−5)², (14.6)
(3) = (3, 1 + √−5)(3, 1 − √−5), (14.7)
(1 + √−5) = (2, 1 + √−5)(3, 1 + √−5), (14.8)
(1 − √−5) = (2, 1 + √−5)(3, 1 − √−5). (14.9)

Note that the factors of the right-hand sides of the above equations are, by Exercise 14.2.6,
all prime ideals. When the four integers in Equation 14.5 are replaced by the respective
right-hand sides of Equations 14.6 to 14.9 we obtain the following:

(6) = (2)(3) = (2, 1 + √−5)² (3, 1 + √−5)(3, 1 − √−5),

and

(6) = (1 + √−5)(1 − √−5)
    = (2, 1 + √−5)(3, 1 + √−5)(2, 1 + √−5)(3, 1 − √−5)
    = (2, 1 + √−5)² (3, 1 + √−5)(3, 1 − √−5).

In other words, the distinct factorizations of the integer 6 above lead to one and the same
factorization of the ideal (6) into prime ideals.
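Equations 14.6 to 14.9 can be checked by machine if each ideal of Z[√−5] is treated as a lattice in Z², with a + b√−5 stored as the pair (a, b). The sketch below is one possible approach and not the text's method: it reduces a spanning set to a canonical basis (a two-dimensional Hermite normal form), so two ideals are equal exactly when their bases agree. All function names are assumptions.

```python
from math import gcd

D = -5   # work in Z[sqrt(-5)]; the element a + b*sqrt(D) is stored as (a, b)

def mul(u, v):
    """Product of two elements of Z[sqrt(D)]."""
    return (u[0] * v[0] + D * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def lattice_basis(vecs):
    """Canonical basis ((m, 0), (x, g)) of the lattice spanned by vecs in Z^2."""
    top, m = (0, 0), 0
    for v in vecs:
        while v[1] != 0:                       # Euclid on the second coordinates
            q = top[1] // v[1]
            top, v = v, (top[0] - q * v[0], top[1] - q * v[1])
        m = gcd(m, v[0])                       # v now has second coordinate 0
    if top[1] < 0:
        top = (-top[0], -top[1])
    if m:
        top = (top[0] % m, top[1])
    return (m, 0), top

def ideal(*gens):
    """The ideal generated by gens, spanned over Z by g and g*sqrt(D)."""
    return lattice_basis([w for g in gens for w in (g, mul(g, (0, 1)))])

def product(I_gens, J_gens):
    """The product ideal, generated by all pairwise products of generators."""
    return ideal(*[mul(g, h) for g in I_gens for h in J_gens])

P2, P3, P3c = [(2, 0), (1, 1)], [(3, 0), (1, 1)], [(3, 0), (1, -1)]
print(product(P2, P2) == ideal((2, 0)))      # Equation 14.6
print(product(P3, P3c) == ideal((3, 0)))     # Equation 14.7
print(product(P2, P3) == ideal((1, 1)))      # Equation 14.8
print(product(P2, P3c) == ideal((1, -1)))    # Equation 14.9
```

Comparing canonical bases avoids having to exhibit explicit combinations of generators, which is the route Exercise 14.2.6 asks for by hand.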


Exercises 14.2 


1. Show that
(a) (3, 1 + 2√−5)(3, 1 − 2√−5) = (3)
(b) (7, 1 − 2√−5)(7, 1 + 2√−5) = (7)
(c) (3, 1 + 2√−5)(7, 1 + 2√−5) = (1 + 2√−5)
(d) (3, 1 − 2√−5)(7, 1 − 2√−5) = (1 − 2√−5)
2. Prove that the product of two ideals is well defined. In other words, show that if
A = A′ and B = B′, then AB = A′B′.
3. Find an ideal X of Z such that
(a) (6)X = (18)
(b) (18)X = (6)
4. Find an ideal X of Z[√−1] such that
(a) (2 + i)X = (7 − 6i)
(b) (7 − 6i)X = (2 + i)
5. Let R be a quadratic domain and I an ideal that divides every ideal of R. Prove
that I = R.
6. Prove Equations 14.6 to 14.9.


14.3. Domains 


Now that we know that some rings do not have the property of unique factorization, it
should be of interest to devise necessary and sufficient conditions that will easily (or maybe
not so easily) determine whether a given ring has unique factorization. Unfortunately,
this is not possible yet and we have to make do with conditions that are necessary or
sufficient. In the last section we demonstrated that certain rings (Z[√−3], Z[√−5],
Z[√−7]) allowed for distinct factorizations. Now we give some conditions that guarantee
uniqueness. Some of the results of this section will make use of ideals.

Consider the multiplication table of Z_6 (Table 4.3). Observe that for every positive
integer n,

3ⁿ ≡ 3 (mod 6).

The element 3 is not invertible and hence it cannot be considered to be a unit. On the
other hand, if 3 were a prime, then this would show that the numbers of Z_6 do not have
unique factorization. Nor does 3 behave like a composite number, since it cannot be
expressed as the product of two smaller numbers. The root of the problem is that

2 · 3 ≡ 0 (mod 6),

or, in other words, the product of two nonzero numbers can be zero. Such a troublesome
nonzero number whose multiples do contain a 0 is called a zero divisor. A ring that is
free of zero divisors is an integral domain.
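Zero divisors in Z_n are easy to list by exhaustive search, which also makes the contrast between composite and prime moduli visible. This is a minimal sketch; the function name `zero_divisors` is chosen here for illustration.

```python
def zero_divisors(n):
    """Nonzero a in Z_n with a*b ≡ 0 (mod n) for some nonzero b in Z_n."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

print(zero_divisors(6))      # composite modulus
print(zero_divisors(7))      # prime modulus

# The observation about powers of 3 modulo 6, checked for small exponents.
print(all(pow(3, k, 6) == 3 for k in range(1, 20)))
```

For n = 6 the search returns 2, 3, and 4, while for the prime 7 it returns nothing, so Z_7 is an integral domain and Z_6 is not.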




Most of the rings we have encountered were integral domains. These include
• the integers Z;
• all fields, including Q, R, C, and Z_p where p is a rational prime;
• the polynomial ring F[X] where F is an arbitrary field; and
• Z[√d].
On the other hand, if n is any composite rational integer, then Z_n has zero divisors, as
does the ring of continuous real-valued functions f : [0, 1] → R.
Let D be an integral domain. A Euclidean function E : D → Z has the following
properties:
(a) If p ≠ 0 and q ≠ 0 are ring elements, then 0 ≤ E(p) ≤ E(pq).
(b) If p ≠ 0 and q ≠ 0 are ring elements such that E(q) ≤ E(p), then there exist two
other elements d and r such that p = dq + r and either r = 0 or E(r) < E(q).
An integral domain is said to be a Euclidean domain provided it has a Euclidean function.
It is easy to see that Z is a Euclidean domain. We need merely define E(a) = a² and
recognize that given the p and the q of requirement (b) above, we let d be the quotient
and r be the remainder when p is divided by q. Similarly, the ring of polynomials R[x]
is also a Euclidean domain. This time we set E(p) = deg(p). Once again properties (a)
and (b) are easily seen to be satisfied where d and r are, respectively, the quotient and
the remainder when p is (long) divided by q.


Contrariwise, 3 ∈ Z[√−5] is not prime because

(2 + √−5)(2 − √−5) = 3 · 3

and yet 3 divides neither 2 + √−5 nor 2 − √−5 (why not?). On the other hand, a norm
argument demonstrates that 3 is irreducible.

Notwithstanding these examples there is a relationship between these two notions that 
will be explored below. 

We now set out to prove that every Euclidean domain has the unique factorization
property. The proof is broken down into five parts whose relationships are depicted in
the chart below. Each item of this list will be shown to be justified by the previous ones,
resulting in the conclusion that every Euclidean domain has the unique factorization
property.

            Euclidean domain
                   ↓
        (a, b) = a_1 a + b_1 b
                   ↓
         principal ideal domain
                   ↓
         irreducibles are prime
                   ↓
ascending chain condition for principal ideals
                   ↓
       unique factorization domain

Proposition 14.10 Let a and b be two elements of a Euclidean domain R. Then a
and b have a greatest common divisor (a, b) which is expressible as a combination of a
and b. That is, there exist two ring elements a_1 and b_1 such that (a, b) = a_1 a + b_1 b.


Proof. The proof is essentially the same as that used in the proofs of Propositions 4.1
and 6.15 and the details are omitted. One need merely create a chain like those in
Figure 6.1 and note that the eventual termination of such a chain is guaranteed by the
fact that the non-negative integer-valued norm of the remainder gets smaller with each
application of the Euclidean algorithm. ∎
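The chain of divisions in this proof can be carried out concretely in the Gaussian integers, with the norm a² + b² playing the role of the Euclidean function. The sketch below is an illustration under that assumption; Gaussian integers are stored as (real, imaginary) pairs, the helper names are hypothetical, and the quotient is obtained by rounding the exact quotient to a nearest Gaussian integer.

```python
def g_mul(u, v):
    """Product of Gaussian integers stored as (real, imaginary) pairs."""
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def g_sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def norm(u):
    return u[0] ** 2 + u[1] ** 2

def g_divmod(p, q):
    """Quotient d and remainder r with p = d*q + r and norm(r) < norm(q), q != 0."""
    n = norm(q)
    x, y = g_mul(p, (q[0], -q[1]))                 # p * conj(q) equals (p/q) * n
    d = ((2 * x + n) // (2 * n), (2 * y + n) // (2 * n))   # round p/q to nearest
    return d, g_sub(p, g_mul(d, q))

def g_xgcd(a, b):
    """Return (g, s, t) with g = s*a + t*b a greatest common divisor of a and b."""
    r0, s0, t0 = a, (1, 0), (0, 0)                 # invariant: r0 = s0*a + t0*b
    r1, s1, t1 = b, (0, 0), (1, 0)                 # invariant: r1 = s1*a + t1*b
    while r1 != (0, 0):
        d, r = g_divmod(r0, r1)
        r0, s0, t0, r1, s1, t1 = (r1, s1, t1, r,
                                  g_sub(s0, g_mul(d, s1)),
                                  g_sub(t0, g_mul(d, t1)))
    return r0, s0, t0

a, b = (7, -6), (2, 1)
g, s, t = g_xgcd(a, b)
sa, tb = g_mul(s, a), g_mul(t, b)
print(g, (sa[0] + tb[0], sa[1] + tb[1]))           # the two pairs agree
```

Rounding to the nearest Gaussian integer keeps the remainder's norm at most half the divisor's norm, which is what forces the loop to terminate.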


A ring is a principal ideal domain (PID) if it has no zero divisors and all of its ideals are
principal.

Proposition 14.11 Every Euclidean domain is a principal ideal domain.


Sketch of proof. Let I be a nonzero ideal of the ring R (the zero ideal is (0)), let E be the
Euclidean function of R, and let M = min{ E(a) | a ∈ I, a ≠ 0 }. Let q ∈ I be a nonzero
element such that E(q) = M. We will show that I = (q). Since q ∈ I it follows that
I ⊇ (q). The reverse inclusion is proved by contradiction. Suppose I ≠ (q); then there
exists an element p ∈ I such that p ∉ (q). By the definition of Euclidean domain, there
exist d, r ∈ R such that p = dq + r, or r = p − dq, where either r = 0 or
E(r) < E(q) = M. Since q is in I so is dq. Since p ∈ I, so is r = p − dq. Thus r is an
element of I which, if nonzero, would satisfy E(r) < M, contrary to the choice of M.
Of necessity r = 0 so that p = dq, which of course means that p ∈ (q), leading to the
desired contradiction. Hence I = (q). ∎


Proposition 14.12 In a Euclidean (principal ideal) domain irreducible elements are
primes.


Proof. Let a be an irreducible element of the PID D and suppose a | bc. We must show
that either a | b or a | c.

Consider the ideal I = { ax + by | x, y ∈ D }. By the previous proposition I is principal
and so we may assume that I = (d). Since a ∈ I, we can write a = dr for some r in
D. Because a is irreducible either d or r is a unit.

If d is a unit, then I = D, and hence there exist x and y such that 1 = ax + by, or
c = acx + bcy. Since a divides both summands of the right-hand side, it follows that
a | c.

On the other hand, if r is a unit, then (a) = (d) = I. Because b ∈ I, there is an
element t ∈ D such that at = b. Hence, a | b. ∎


A sequence I_n of ideals is said to be an ascending chain of ideals if it satisfies the
inclusions

I_k ⊆ I_{k+1} for all k = 1, 2, 3, ....

For example, the sequence I_n, n = 1, 2, 3, ..., where


(3) 


is an ascending chain. So is the sequence of polynomial rings Z[x_1, x_2, ..., x_n].


Lemma 14.13 If {I_n}, n = 1, 2, 3, ..., is an ascending sequence of ideals of some
domain D, then

I_∞ = I_1 ∪ I_2 ∪ I_3 ∪ ···

is also an ideal of D.


Proof. Let α ∈ I_∞. Then there exists an index, call it n_α, such that α ∈ I_{n_α}. Note that
α ∈ I_n for all n ≥ n_α because the given sequence is increasing. For any β ∈ I_∞, both α
and β belong to I_m, where m = max(n_α, n_β), and hence

α + β, αβ ∈ I_m ⊆ I_∞.

Thus, the set I_∞ is closed with respect to addition and multiplication. Similar arguments
can be used to prove that this set is indeed an ideal of D. ∎
Proposition 14.14 Let I_1 ⊆ I_2 ⊆ I_3 ⊆ ··· ⊆ I_n ⊆ ··· be an increasing sequence of
principal ideals of the Euclidean domain D. Then there exists an integer k such that I_n = I_k
for all n ≥ k.

Proof. By Lemma 14.13 it is known that

I_∞ = I_1 ∪ I_2 ∪ I_3 ∪ ···

is also an ideal. By Proposition 14.11, I_∞ is also principal and hence I_∞ = (q) for some
q ∈ D. By the definition of the union operation, q ∈ I_k for some k = k_q. Because the
given sequence of ideals is increasing, it follows that

q ∈ I_n for all n ≥ k_q.

Thus, for all n ≥ k_q

(q) ⊆ I_n ⊆ I_∞ = (q),

and hence n ≥ k_q implies that I_n = I_{k_q}. ∎
q 


A prime factorization of an element a of a ring is an equation

a = ε p_1^{r_1} p_2^{r_2} ··· p_k^{r_k} (14.15)

where ε is a unit of R, and for each i = 1, 2, 3, ..., k, the integer p_i is a prime element of
R, and r_i is a nonnegative rational integer. A ring is said to have the unique factorization
property if given two prime factorizations of associate elements such as Equation 14.15
and

a′ = δ q_1^{s_1} q_2^{s_2} ··· q_h^{s_h} (14.16)

necessarily h = k and there is a relabeling of the q_i's such that p_i and q_i are associates
and r_i = s_i for each i = 1, 2, 3, ..., h = k.


Proposition 14.17 Every Euclidean domain has the unique factorization property. 


Proof. We first show, by contradiction, that every element a € D has a factorization into 
irreducible elements. Suppose that is not the case for the element a). Then a, cannot be 


irreducible for if it were, then a) would be a factorization into irreducibles for itself. 


366 ABSTRACT RINGS 


Since $a_0$ is decomposable, there exist nonunits $a_1$ and $b_1$ such that $a_0 = a_1 b_1$. By assumption, either $a_1$ or $b_1$ is decomposable; relabel if necessary so that $a_1$ is reducible. Hence there exist nonunits $a_2$ and $b_2$ such that $a_1 = a_2 b_2$. Once again, the hypothesis implies that at least one of $a_2$ and $b_2$ is reducible. Relabel, if necessary, so that $a_2$ is reducible. Continuing indefinitely we produce a sequence of distinct nonunits $a_0, a_1, a_2, \ldots$ such that $a_{n+1} \mid a_n$. It follows that we have an increasing sequence of ideals

$$(a_0) \subsetneq (a_1) \subsetneq (a_2) \subsetneq (a_3) \subsetneq \cdots$$

which clearly does not stabilize. This contradicts the previous proposition and so there must exist a factorization of $a_0$ into irreducible elements. By Proposition 14.12 this is also a prime factorization of $a_0$. To prove that Euclidean domains have unique factorization, the induction process of Theorem 4.9 must be modified. For any factorization

$$\varphi : \quad a = \varepsilon\, p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}$$

we define the length of the factorization $\varphi$ to be $\lambda(\varphi) = r_1 + r_2 + \cdots + r_k$. We proceed to prove the uniqueness by induction on $\lambda$. Suppose the element $a$ has a factorization of length zero. Then $a$ is a unit and every factorization of $a$ has length zero and the unique factorization holds. Let $n$ be a positive rational integer and suppose that the unique factorization holds for all the elements of $R$ that have a factorization of length less than $n$. Let $a$ be a ring element with a factorization

$$\varphi : \quad a = \varepsilon\, p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}$$

of length $n$. Let

$$\psi : \quad a = \delta\, q_1^{s_1} q_2^{s_2} \cdots q_h^{s_h}$$

be another factorization of $a$. Since $p_1 \mid a$ it follows from the definition of primality that $p_1 \mid q_i$ for some $i \in \{1, 2, \ldots, h\}$. Because $D$ is a Euclidean domain both $p_1$ and $q_i$ are irreducibles and hence there exists a unit $\eta$ so that $p_1 = \eta q_i$, or, after relabeling, $p_1 = \eta q_1$. Let $\varphi'$ and $\psi'$ be the factorizations obtained when $\varphi$ and $\psi$ are divided by $p_1$ and $\eta q_1$, respectively. Then

$$\varphi' : \quad \frac{a}{p_1} = \varepsilon\, p_1^{r_1 - 1} p_2^{r_2} \cdots p_k^{r_k}$$



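and

$$\psi' : \quad \frac{a}{p_1} = \frac{a}{\eta q_1} = \delta \eta^{-1}\, q_1^{s_1 - 1} q_2^{s_2} \cdots q_h^{s_h}.$$

Since $\lambda(\varphi') = \lambda(\varphi) - 1 < n$, the induction hypothesis (unique factorization for every element that has a factorization of length less than $n$) applies to $a/p_1$ so that $h = k$, $p_i$ and $q_i$ are associates, and $r_i = s_i$ for all $i = 1, 2, \ldots, h = k$. Hence $\varphi$ and $\psi$ are the same factorizations. This completes the induction and the proof. ∎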
Corollary 14.18 The Gaussian and Eulerian numbers have the unique factorization property.

Exercises 14.3


1. Prove that every finite integral domain is a field. 


14.4 Quotients of Rings 


By definition, every ring $R$ is a commutative group with respect to its “+” operation, and every ideal $I$ of $R$ constitutes a subgroup of $R$ with respect to this addition operation. As we saw in Chapter 10, this set of circumstances yields a quotient structure $R/I$ whose elements are cosets $a + I$. Since the addition on $R$ is commutative, we needn’t be concerned about right cosets versus left cosets; they are the same. As noted in Theorem 10.2, the addition of cosets, defined by

$$(a + I) + (b + I) = (a + b) + I,$$

defines a group $R/I$ whose elements are the cosets of $I$ and whose binary operation is defined above. Multiplication of cosets is defined analogously by $(a + I)(b + I) = ab + I$, and it is well defined because $I$ is an ideal. With minor changes the statement of the Law of Homomorphisms holds in the context of rings as well.


Theorem 14.19 (The Law of Homomorphisms) Let $f : G \to Q$ be a surjective ring homomorphism. Then $\mathrm{Ker} f$ is a subring of $G$ such that $G/\mathrm{Ker} f \cong Q$.


For example, if $R = \mathbb{Z}_6$ and $I = \{0, 3\}$, then the quotient ring has as its elements the cosets $\{0, 3\}$, $\{1, 4\}$, and $\{2, 5\}$. This observation may be restated as

$$\mathbb{Z}_6 / \{0, 3\} \cong \mathbb{Z}_3.$$
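The coset arithmetic in this example can be checked mechanically. The following Python sketch is illustrative only: the helper names `cosets` and `quotient_matches_Z3` are not from the text. It lists the cosets of $\{0, 3\}$ in $\mathbb{Z}_6$ and confirms that reduction modulo 3 respects both operations, which is one way to see the isomorphism with $\mathbb{Z}_3$.

```python
def cosets(n, ideal):
    """Distinct cosets a + I of the subset I of Z_n, each as a sorted tuple."""
    found = []
    for a in range(n):
        coset = tuple(sorted((a + x) % n for x in ideal))
        if coset not in found:
            found.append(coset)
    return found


def quotient_matches_Z3():
    """Check that a -> a mod 3 turns arithmetic in Z_6 into arithmetic in Z_3."""
    for a in range(6):
        for b in range(6):
            if ((a + b) % 6) % 3 != ((a % 3) + (b % 3)) % 3:
                return False
            if ((a * b) % 6) % 3 != ((a % 3) * (b % 3)) % 3:
                return False
    return True


print(cosets(6, {0, 3}))       # [(0, 3), (1, 4), (2, 5)]
print(quotient_matches_Z3())   # True
```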


For example, let $R = \mathbb{Z}[i]$ and $I = (1 + i)$. We will demonstrate that the quotient group $R/I$ is in fact isomorphic to $GF(2, x)$.




Figure 14.1 Two interlaced lattices 


 +    |  I     1+I
------+------------
 I    |  I     1+I
 1+I  |  1+I   I


Figure 14.2 A ring with two elements 


Note that the following statements are logically equivalent: $x + yi \in (1 + i)$ if and only if $x + yi = (a + bi)(1 + i)$ has an integer solution for $a$ and $b$ if and only if $a = (x + y)/2$ and $b = (y - x)/2$ are both integers, which happens if and only if $x$ and $y$ have the same parity, that is, $x \equiv y \pmod 2$ (see Figure 14.1).

It follows that the coset

$$1 + I = \{\, x + iy \mid x \not\equiv y \pmod 2 \,\}$$

is the only other coset besides $I$. The addition and multiplication tables of the quotient group appear in Figure 14.2 and it is easily verified that this ring is isomorphic to the Galois field $GF(2, x)$.
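The parity criterion makes the two-element quotient easy to tabulate. The sketch below is a hedged illustration: it stores $x + yi$ as the pair `(x, y)`, and the names `in_ideal`, `coset`, `ADD`, and `MUL` are assumptions of the sketch rather than notation from the text.

```python
def in_ideal(x, y):
    """True when x + yi is a multiple of 1 + i in Z[i]."""
    # x + yi = (a + bi)(1 + i) forces a = (x + y)/2 and b = (y - x)/2,
    # and both are integers exactly when x and y have the same parity.
    return (x - y) % 2 == 0


def coset(z):
    """Label of the coset of z = (x, y), meaning x + yi."""
    return "I" if in_ideal(*z) else "1+I"


def add(z, w):
    return (z[0] + w[0], z[1] + w[1])


def mul(z, w):
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


REPS = {"I": (0, 0), "1+I": (1, 0)}   # one representative per coset
ADD = {(p, q): coset(add(REPS[p], REPS[q])) for p in REPS for q in REPS}
MUL = {(p, q): coset(mul(REPS[p], REPS[q])) for p in REPS for q in REPS}

for table in (ADD, MUL):
    print(table)
```

The two dictionaries play the role of the addition and multiplication tables of the quotient; with `I` read as 0 and `1+I` read as 1 they agree with arithmetic modulo 2.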

Another example appears in Tables 10.4 to 10.7 wherein the Galois field $GF(2, x^2 + x + 1)$ is displayed as the quotient ring $\mathbb{Z}_2[x]/(x^2 + x + 1)$.

Given a ring $R$ and an ideal $I \subsetneq R$, the ideal is said to be a maximal ideal if the only ideals that contain $I$ are itself and $R$.

Theorem 14.20 An ideal $I$ of a ring $R$ is maximal if and only if $R/I$ is a field (with respect to the quotient operations).




Proof. Suppose $I$ is a maximal ideal of $R$ and let $a$ be any ring element not in $I$ so that $a + I \ne I$. It follows from the maximality of $I$ that the ideal $(a, I)$ consists of all the elements of $R$. In particular, there exist elements $b \in R$ and $c \in I$ such that $ab + c = 1$. Hence,

$$(a + I)(b + I) = ab + I = ab + c + I = 1 + I = 1_{R/I}.$$

Conversely, suppose $R/I$ is a field and let $J$ be any ideal of $R$ that properly contains $I$. We now show that necessarily $J = R$.

Since $J \supsetneq I$ there exists an element $a$ such that $a \in J - I$. Consequently $a + I \ne I = 0_{R/I}$. Since $R/I$ is a field the quotient element $a + I$ has a multiplicative inverse, say $b + I$. Thus

$$ab + I = (a + I)(b + I) = 1 + I.$$

It follows that $1 - ab \in I \subset J$. Since $a \in J$ and $J$ is an ideal, it follows that $ab \in J$. Thus,

$$1 = (1 - ab) + ab \in J + J = J.$$

Thus $J = R$. ∎


Lemma 14.21 Let $F$ be a field and $p(x)$ an irreducible polynomial in $F[x]$. Then $(p(x))$ is a maximal ideal of $F[x]$.

Proof. Let $I$ be an ideal that properly contains $(p(x))$ and let $g(x)$ be another polynomial in $I - (p(x))$. Then, because $F[x]$ is a PID, there is a polynomial $q(x) \in F[x]$ such that $I = (q(x))$. Since $p(x) \in I$ it follows that $q(x) \mid p(x)$. Because $I$ properly contains $(p(x))$, the divisor $q(x)$ is not an associate of $p(x)$, so the irreducibility of $p(x)$ now implies that $q(x)$ is a unit and hence $I = F[x]$. ∎


Theorem 14.22 Let $p \in \mathbb{Z}$ be a prime, $F$ be the field $\mathbb{Z}_p$, and $P(x)$ be a primitive polynomial in $F[x]$. Then $F[x]/(P(x)) \cong GF(p, P(x))$.

Proof. Let $\alpha$ be a primitive element of $GF(p, P(x))$ and define a function $f : F[x] \to GF(p, P(x))$ via $f(q(x)) = q(\alpha)$ where $q(x)$ is an arbitrary polynomial in $F[x]$. That $f$ is a homomorphism of $F[x]$ follows from

$$f(q(x))\, f(q'(x)) = q(\alpha)\, q'(\alpha) = f(q(x)\, q'(x)).$$

Galois’ Primitive Element Theorem 7.17 guarantees the surjectivity of $f$. Thus, $f$ is a homomorphism of $F[x]$ whose kernel is $(P(x))$. The desired result follows from the Law of Homomorphisms. ∎
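Lemma 14.21 and Theorem 14.22 can be tried out on the smallest interesting case, $\mathbb{Z}_2[x]/(x^2 + x + 1)$. The sketch below encodes a polynomial over $\mathbb{Z}_2$ as a Python integer whose bit $k$ is the coefficient of $x^k$; this encoding and all function names are assumptions of the sketch, not notation from the text. For contrast it also tests the reducible polynomial $x^2 + 1 = (x + 1)^2$, whose quotient has a zero divisor.

```python
def poly_mod(a, p):
    """Remainder of a on division by p in Z_2[x] (bit k = coefficient of x^k)."""
    deg_p = p.bit_length() - 1
    while a.bit_length() - 1 >= deg_p:
        a ^= p << (a.bit_length() - 1 - deg_p)   # cancel the leading term
    return a


def poly_mul(a, b, p):
    """Product of a and b in Z_2[x], reduced modulo p."""
    product = 0
    while b:
        if b & 1:
            product ^= a        # addition in Z_2[x] is XOR
        a <<= 1
        b >>= 1
    return poly_mod(product, p)


def quotient_is_field(p):
    """True when every nonzero residue modulo p has a multiplicative inverse."""
    size = 1 << (p.bit_length() - 1)
    return all(
        any(poly_mul(a, b, p) == 1 for b in range(1, size))
        for a in range(1, size)
    )


def powers_of_x(p):
    """The residues x, x^2, ..., x^(size-1) modulo p (degree of p at least 2)."""
    size = 1 << (p.bit_length() - 1)
    powers, current = [], 0b10
    for _ in range(size - 1):
        powers.append(current)
        current = poly_mul(current, 0b10, p)
    return powers


print(quotient_is_field(0b111))   # x^2 + x + 1: True
print(quotient_is_field(0b101))   # x^2 + 1:     False
print(powers_of_x(0b111))         # [2, 3, 1], i.e. x, x + 1, 1
```

That the powers of $x$ run through all three nonzero residues is what makes $x$ a primitive element in this small case.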




We now turn to examine some quotient rings of the Gaussian integers $\mathbb{Z}[\sqrt{-1}]$. Given
a group G and a normal subgroup H, a complete residue system is a set S such that every 
coset of H! contains exactly one element of S. For example, the set { 1,4, 6, ¢} is a com- 
plete residue system of the quotient Z,,/H portrayed in Table 10.2 and Figure 10.2, and 
the set {d,e, 6, g } isa complete residue system of the subgroup H of the Quaternions 
(see Table 10.2 and Figure 10.2). The set $\{1, 2, 3, \ldots, n\}$ is such a system for the ideal $(n)$ in $\mathbb{Z}$.

For example, we will find a complete residue system for $\mathbb{Z}[i]/(2 + i)$. Since $N(2 + i) = 5$, it follows from the long division process that every coset of $(2 + i)$ has an element of norm less than 5. Hence the set of elements in Figure 14.3, call it $V$, contains a complete residue system. The set $V$ is winnowed down to a complete residue system of Figure 14.3 by the observation that if for some unit $\varepsilon \in \{\pm 1, \pm i\}$, $z_1 = z_2 + (2 + i)\varepsilon$, then $z_1$ and $z_2$ belong to the same coset and so one of them can be deleted. The complete residue system clearly contains five elements and their interaction is identical with the two tables in Figure 14.4.
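The winnowing described above can be imitated by a short search. This is a sketch under stated assumptions: it tests membership in $(2 + i)$ by multiplying with the conjugate $2 - i$ and checking divisibility by the norm 5, it compares any two candidates rather than only those differing by a unit multiple of $2 + i$, and the explicit map $x + yi \mapsto x + 3y \pmod 5$ (which uses $i \equiv -2$ modulo $2 + i$) is one convenient choice of isomorphism onto $\mathbb{Z}_5$ that is not taken from the text. All names are hypothetical.

```python
def in_ideal(x, y):
    """True when x + yi is a multiple of 2 + i in Z[i]."""
    # (x + yi)(2 - i) = (2x + y) + (2y - x)i must be divisible by N(2 + i) = 5.
    return (2 * x + y) % 5 == 0 and (2 * y - x) % 5 == 0


def same_coset(z, w):
    return in_ideal(z[0] - w[0], z[1] - w[1])


def residue_system():
    """Pick one element of norm < 5 from each coset of (2 + i)."""
    candidates = [
        (x, y)
        for x in range(-2, 3)
        for y in range(-2, 3)
        if x * x + y * y < 5
    ]
    reps = []
    for z in candidates:
        if not any(same_coset(z, r) for r in reps):
            reps.append(z)
    return reps


def to_Z5(x, y):
    """Image of x + yi in Z_5 under i -> 3 (an assumed explicit isomorphism)."""
    return (x + 3 * y) % 5


reps = residue_system()
print(len(reps))                            # 5
print(sorted(to_Z5(*z) for z in reps))      # [0, 1, 2, 3, 4]
```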

Given two rings $R$ and $S$ and a function $f : R \to S$ such that

(a) $f$ is injective (one-to-one),

(b) $f$ is onto (surjective),

(c) $f(x + y) = f(x) + f(y)$ for all $x, y \in R$,

(d) $f(xy) = f(x) f(y)$ for all $x, y \in R$,

the rings $R$ and $S$ are said to be isomorphic, and $f : R \to S$ is a ring isomorphism of $R$ and $S$. This is, of course, quite similar to the definition of a group isomorphism (Section 9.3) and even more so to the definition of an isomorphism of fields (Section 10.3). In fact, it could be said that a ring isomorphism is a group isomorphism that also satisfies requirement (d) above.


A prime subfield of the field $F$ is one which does not have any proper subfields.

Proposition 14.23 Every field has a unique prime subfield.

Proof. Let $F$ be a field and let $F_0$ denote the intersection of all the subfields of $F$. By Exercise 14.4.7, $F_0$ is a field. Moreover, if $F_0 \subseteq F$ contained a proper subfield, say $G$, then $F_0 \cap G = G \subsetneq F_0$, contradicting the definition of $F_0$. Hence $F_0$ is a prime subfield of $F$. If $H$ is any other prime subfield of $F$, then the field $H \cap F_0$ is a subfield that must equal both and hence $H = F_0$. ∎


Figure 14.3 In search of a ring

Figure 14.4 $\mathbb{Z}[i]/(2 + i) \cong \mathbb{Z}_5$


Proposition 14.24 If $F$ is an arbitrary field, then $F$ contains a subfield that is isomorphic to exactly one of the fields

$$\mathbb{Q},\ \mathbb{Z}_2,\ \mathbb{Z}_3,\ \mathbb{Z}_5,\ \ldots,\ \mathbb{Z}_p,\ \ldots$$

where $p$ varies over all the rational primes.


Proof. Let $1_F$ be the multiplicative identity of $F$ and for each positive integer $n$ set $n \cdot 1_F$ equal to the sum of $n$ $1_F$'s. Consider the sequence

$$1_F,\ 2 \cdot 1_F,\ 3 \cdot 1_F,\ \ldots$$

The terms of this sequence are either all distinct or else there exist distinct coefficients $g$ and $h$ such that $g \cdot 1_F = h \cdot 1_F$. In the first case let $f : \mathbb{Q} \to F$ be defined as follows. First set $f(m) = m \cdot 1_F$ for $m \in \mathbb{Z}$ and

$$f(m/n) = (m \cdot 1_F)(n \cdot 1_F)^{-1} \in F.$$

The proofs of the following facts are straightforward (see Exercise 14.4.8):

(a) The function $f$ is well defined.

(b) The range of $f$ is in $F$.

(c) The function $f$ is injective.

(d) $f(x + y) = f(x) + f(y)$ and $f(xy) = f(x) f(y)$.

It follows that $f$ is an isomorphism of $\mathbb{Q}$ with a subset of $F$. In the second case observe that $(g - h) \cdot 1_F = 0$. Set

$$G = \{\, g \in \mathbb{Z} \mid g \cdot 1_F = 0 \,\}.$$

Since $G$ is an ideal of $\mathbb{Z}$, it must be principal and hence there exists a positive integer $p$ such that $G = (p)$. Once again we define $f : \mathbb{Z}_p \to F$ by $f(m) = m \cdot 1_F$ for $m \in \mathbb{Z}_p$, and leave it for the reader to verify that the above four properties again hold for $f$. Hence $f$ is an isomorphism of $\mathbb{Z}_p$ with a subset of $F$.

The reason $p$ must be a prime is that $\mathbb{Z}_p$ is isomorphic to a subset of $F$, which, being a field, has no zero divisors. The uniqueness of $p$ follows from Proposition 14.23. ∎
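The dichotomy in this proof (the multiples of $1_F$ are all distinct, or some multiple vanishes) can be probed numerically. The function below is a hypothetical helper, not something from the text: it returns the positive generator $p$ of $G$ when it finds one, and returns 0 when no multiple of the identity vanishes within a search limit, which is only suggestive of the first case and not a proof of it. The four-element field is modelled by pairs over $\mathbb{Z}_2$, of which only the addition is needed here.

```python
from fractions import Fraction


def characteristic(one, zero, add, limit=1000):
    """Smallest n >= 1 with n * one == zero, or 0 if none is found up to limit."""
    total = one
    for n in range(1, limit + 1):
        if total == zero:
            return n
        total = add(total, one)
    return 0


# Z_7: the multiples of 1 first vanish at n = 7.
print(characteristic(1, 0, lambda a, b: (a + b) % 7))

# A field with four elements, written as pairs (a, b) for a + b*x over Z_2.
gf4_add = lambda u, v: ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)
print(characteristic((1, 0), (0, 0), gf4_add))

# The rationals: no multiple of 1 vanishes, so the search gives up and returns 0.
print(characteristic(Fraction(1), Fraction(0), lambda a, b: a + b))
```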


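Let $R = \mathbb{Z}$ and let $I = (3)$. Define the function $f : \mathbb{Z} \to \mathbb{Z}_3$ via

$$f(x) = \bar{x} \in \{0, 1, 2\}, \quad \bar{x} \equiv x \pmod 3.$$

Then $f$ is a surjective homomorphism from $R$ onto $\mathbb{Z}_3$. The kernel of this homomorphism is $\{\, x \in \mathbb{Z} \mid 3 \mid x \,\}$ or $(3)$.

Let $f : \mathbb{Z}[i] \to \mathbb{Z}_3$ be defined as

$$f(x + iy) = \bar{x} \in \{0, 1, 2\}, \quad \bar{x} \equiv x \pmod 3.$$

Then

$$\mathrm{Ker} f = \{\, x + iy \mid x \equiv 0 \pmod 3 \,\}.$$

Let $f(x + iy) = x + y$. Then

$$\mathrm{Ker} f = \{\, x + iy \in \mathbb{Z}[i] \mid x + y = 0 \,\} = \{\, x + iy \in \mathbb{Z}[i] \mid y = -x \,\} = (1 - i).$$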


Exercises 14.4 


1. Prove that for any two positive integers $m$ and $n$, $\mathbb{Z}_{mn}/(m) \cong \mathbb{Z}_m$.

2. Let $\alpha$ be any Gaussian integer. Prove that $\mathbb{Z}[i]/(\alpha)$ is finite.

3. Find a complete residue system for $(3 + i)$ in $\mathbb{Z}[i]$. Use the residue system to create both an addition and a multiplication table for $\mathbb{Z}[i]/(3 + i)$. Is this a field? Justify your answer.

4. Find a complete residue system for $(3 + 2i)$ in $\mathbb{Z}[i]$. Use the residue system to create both an addition and a multiplication table for $\mathbb{Z}[i]/(3 + 2i)$. Is this a field? Justify your answer.

5. Let $F$ denote the set of all the continuous functions $f : [0, 1] \to \mathbb{R}$.

(a) Prove that the subset of all the functions $f \in F$ such that $f(0) = 0$ is a maximal ideal of $F$.

(b) Find four other maximal ideals of $F$.

(c) Find three ideals $I_1, I_2, I_3$ of $F$ such that $I_1 \supsetneq I_2 \supsetneq I_3$.




6. Let & be the ring with underlying set 


+42) |abeR} 


whose addition and multiplication are the standard matrix operations. 
(a) Explain why these operations define a ring on R. 


(b) Explain why the subset 


r={(94)|beR} 


is an ideal of R. 


(c) Prove that R/J=R. 
7. Prove that the intersection of all the subfields of a fixed field F is a field. 


8. Complete the proof of Proposition 14.24. 


Chapter Summary 


Rings and ideals were defined abstractly. However, we met an old friend—the quotient 
structures that were so useful in the context of group theory turn out to provide just the 


right language for the rings and ideals as well. 
Chapter Review Exercises 


Mark the following true or false. 

1. Every ideal is principal.

2. Every ring has at least two distinct ideals.

3. Every ring has a nonprincipal ideal.

4. If the ideal $I$ of the ring $R$ is maximal, then $R/I$ is a domain.

5. If the ideal $I$ of $R$ is such that $R/I$ is a domain, then $R/I$ is a field.

6. The Gaussian integers have unique factorization.

7. The Eulerian integers have unique factorization.

8. The ring $\mathbb{Z}[\sqrt{-2}]$ has unique factorization.

9. There exists a field with exactly 6 elements.

10. For each positive integer $n$ greater than 1, there exists a ring of order $n$.


New Terms

ascending chain of ideals, 364
complete residue system, 370
complex integers, 356
Euclidean domain, 362
Euclidean function, 362
ideal, 358
integers, 356
integral domain, 361
irrational integers, 356
irreducible, 356
irreducible ideal, 359
isomorphic, 370
maximal ideal, 368
prime, 357
prime factorization, 365
prime subfield, 370
principal ideal domain, 363
principal ideals, 358
quadratic domains, 356
rational integers, 356
ring, 355
ring isomorphism, 370
unique factorization property, 365
unit ideal, 358
zero divisor, 361

Supplementary Exercises

1. How many rings are there of order $n$ where $n = 1, 2, 3$?

2. Is there a noncommutative ring of order 4?

3. Is there a noncommutative ring of order 5?


A. Excerpts from Al-Khwarizmi’s Solution of the Quadratic Equation¹


Containing Demonstrations of the Rules of the Equations of Algebra. 


... furthermore I discovered that the numbers of restoration and opposition are composed
of these three kinds: namely, roots, squares, and numbers.² However, number alone is
connected neither with roots nor with squares by any ratio. Of these, then, the root is 
anything composed of units which can be multiplied by itself, or any number greater 
than unity multiplied by itself: or that which is found to be diminished below unity 
when multiplied by itself. The square is that which results from the multiplication of a 
root by itself. 

Of these three forms, then, two may be equal to each other, as for example: 

Squares equal to roots, 


Squares equal to numbers, and 


Roots equal to numbers.³

Chapter I. Concerning Squares Equal to Roots⁴


The following is an example of squares equal to roots: a square is equal to five roots. The 
root of the square then is five, and 25 forms its square which, of course, equals five of its 
roots. 

Another example: the third part of a square equals four roots. Then the root of the 
square is 12 and 144 designates its square. And similarly, five squares equal ten roots. 
Therefore, one square equals two roots and the root of the square is two. Four represents 


the square.

¹ Cardano, G., Ars Magna or The Rules of Algebra, T. Richard Witmer translator. 1968. Reprinted by permission of MIT Press.

² The term “roots” (radices) stands for multiples of the unknown, our $x$; the term “squares” (substantiae) stands for multiples of our $x^2$; “numbers” (numeri) are constants.

³ In our notation, $x^2 = ax$, $x^2 = b$, $x = c$.

⁴ Latin: de substantiis numeros coaequantibus. The examples are $x^2 = 5x$, $x^2/3 = 4x$, $5x^2 = 10x$.




In the same manner, then, that which involves more than one square, or is less than 
one, is reduced to one square. Likewise you perform the same operation upon the roots 


which accompany the squares. 
Chapter II. Concerning Squares Equal to Numbers 


Squares equal to numbers are illustrated in the following manner: a square is equal to 
nine. Then nine measures the square of which three represents one root. 

Whether there are many or few squares, they will have to be reduced in the same 
manner to the form of one square. That is to say, if there are two or three or four squares, 
or even more, the equation formed by them with their roots is to be reduced to the form 
of one square with its root. Further, if there be less than one square, that is, if a third or a 
fourth or a fifth part of a square or root is proposed, this is treated in the same manner. 

For example, five squares equal 80. Therefore, one square equals the fifth part of 
the number 80, which, of course, is 16. Or, to take another example, half of a square 
equals 18. This square therefore equals 36. In like manner all squares, however many, 
are reduced to one square, or what is less than one is reduced to one square. The same 


operation must be performed upon the numbers which accompany the squares. 
Chapter III. Concerning Roots Equal to Numbers 


The following is an example of roots equal to numbers: a root is equal to three. Therefore 
nine is the square of this root. 

Another example: four roots equal 20. Therefore one root of this square is five. Still 
another example: half a root is equal to ten. The whole root therefore equals 20, of which, 
of course, 400 represents the square. 

Therefore roots and squares and pure numbers are, as we have shown, distinguished 
from one another. Whence also from these three kinds which we have just explained, 
three distinct types of equations are formed involving three elements, as 

A square and roots equal to numbers, 

A square and numbers equal to roots, and 


Roots and numbers equal to a square. 
Chapter IV. Concerning Squares and Roots Equal to Numbers 
The following is an example of squares and roots equal to numbers: a square and ten 


roots are equal to 39 units.⁵ The question therefore in this type of equation is about as

⁵ This example, $x^2 + 10x = 39$, with answer $x = 3$, “runs,” as Karpinski notices in his introduction to this
translation, “like a thread of gold through the algebras for several centuries, appearing in the algebras of Abu


follows: what is the square which combined with ten of its roots will give a sum total 
of 39? The manner of solving this type of equation is to take one-half of the roots just 
mentioned. Now the roots in the problem before us are ten. Therefore, take five, which 
multiplied by itself gives 25, an amount which you add to 39, giving 64. Having taken 
then the square root of this which is eight, subtract from it the half of the roots, five, 
leaving three. The number three therefore represents one root of this square, which itself, 
of course, is nine. Nine therefore gives that square. 

Similarly, however many squares are proposed all are to be reduced to one square. 
Similarly also you may reduce whatever numbers or roots accompany them in the same 
way in which you have reduced the squares. 

The following is an example of this reduction: two squares and ten roots equal 48 
units. The question therefore in this type of equation is something like this: what are the 
two squares which when combined are such that if ten roots of them are added, the sum 
total equals 48? First of all it is necessary that the two squares be reduced to one. But 
since one square is the half of two, it is at once evident that you should divide by two all 
the given terms in this problem. This gives a square and five roots equal to 24 units. The 
meaning of this is about as follows: what is the square which amounts to 24 when you 
add to it five of its roots? At the outset it is necessary, recalling the rule above given that 
you take one-half of the roots. This gives two and one-half which multiplied by itself
gives 6¼. Add to this 24, giving 30¼. Take then of this total the square root, which is,
of course, 5½. From this subtract half of the roots, 2½, leaving three, which expresses


one root of the square, which itself is nine. 
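The verbal rule of this chapter (reduce to one square, halve the roots, square the half, add the number, take the square root, subtract the half) can be traced step by step. The Python sketch below is a modern illustration only; the function name and its arguments are hypothetical, and it assumes all three coefficients are positive, as in the text's examples.

```python
import math


def al_khwarizmi_root(squares, roots, number):
    """Positive root of squares*x^2 + roots*x = number (all coefficients positive)."""
    # Reduce to a single square by dividing every term by the number of squares.
    b = roots / squares
    c = number / squares
    half = b / 2                      # take one-half of the roots
    total = half * half + c           # multiply it by itself and add the number
    return math.sqrt(total) - half    # take the square root, subtract the half


print(al_khwarizmi_root(1, 10, 39))   # 3.0
print(al_khwarizmi_root(2, 10, 48))   # 3.0
```

The two calls reproduce the worked examples: a square and ten roots equal to 39, and two squares and ten roots equal to 48.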


Chapter VI. Geometrical Demonstrations⁶


We have said enough, says Al-Khowarizmi, so far as numbers are concerned, about 
the six types of equations. Now, however, it is necessary that we should demonstrate 
geometrically the truth of the same problems which we have explained in numbers. 
Therefore our first proposition is this, that a square and ten roots equals 39 units. 

The proof is that we construct (Figure A.1) a square of unknown sides, and let this 


square figure represent the square (second power of the unknown) which together with its 


Kamil, Al-Karkhi and Omar al-Khayyami, and frequently in the works of Christian writers,” and it still graces our present algebra texts. The solution of this type, $x^2 + ax = b$, is, as we can verify, based on the formula $x = \sqrt{(a/2)^2 + b} - a/2$.

⁶ For these geometric demonstrations, we must go back, as said, to Euclid’s Elements (Book VI, Propositions 28 and 29; see also Book II, Propositions 5 and 6). See also on this subject the introduction to the Principal works of Simon Stevin.


Figure A.1 


root you wish to find. Let the square, then, be ab, of which any side represents one root. When we multiply any side of this by a number (or numbers) it is evident that which results from the multiplication will be a number of roots equal to the root of the same number (of the square). Since then ten roots were proposed with the square, we take a fourth part of the number ten and apply to each side of the square an area of equidistant sides, of which the length should be the same as the length of the square first described and the breadth 2½, which is a fourth part of 10. Therefore four areas of equidistant sides are applied to the first square, ab. Of each of these the length is the length of one root of the square ab and also the breadth of each is 2½, as we have just said. These now are the areas c, d, e, and f. Therefore it follows from what we have said that there will be four sides of unequal length, which also are regarded as unknown. The size of the areas in each of the four corners, which is found by multiplying 2½ by 2½, completes that which is lacking in the larger or whole area. Whence it is that we complete the drawing of the larger area by the addition of the four products, each 2½ by 2½; the whole of this multiplication gives 25.

And now it is evident that the first square figure, which represents the square of the 
unknown [$x^2$], and the four surrounding areas [$10x$] make 39. When we add 25 to this,
that is, the four smaller squares which indeed are placed at the four angles of the square 
ab, the drawing of the larger square, called GH, is completed (Figure A.2). Whence 
also the sum total of this is 64, of which eight is the root, and by this is designated one 
side of the completed figure. Therefore when we subtract from eight twice the fourth 
part of ten, which is placed at the extremities of the larger square GH, there will remain 
but three. Five being subtracted from eight, three necessarily remains, which is equal to 


one side of the first square ab.




Figure A.2 


This three then expresses one root of the square figure, that is, one root of the proposed 
square of the unknown, and nine the square itself. Hence we take half of ten and multiply 
this by itself. We then add the whole product of the multiplication to 39, that the drawing 
of the larger square GH may be completed; for the lack of the four corners rendered 
incomplete the drawing of the whole of this square. Now it is evident that the fourth part 
of any number multiplied by itself and then multiplied by four gives the same number as 
half of the number multiplied by itself. Therefore, if half of the root is multiplied by itself, 
the sum total of this multiplication will wipe out, equal, or cancel the multiplication of 
the fourth part by itself and then by four. 

The remainder of the treatise deals with problems that can be reduced to one of the 
six types, for example, how to divide ten into two parts in such a way that the sum of the 
products obtained by multiplying each part by itself is equal to 58: $x^2 + (10 - x)^2 = 58$,


so that x = 3 or x =7. This is followed by a section on problems of inheritance. 


B. Excerpts from Cardano’s Ars Magna¹

Chapter XI. On the Cube and First Power Equal to the Number

Scipio Ferro of Bologna well-nigh thirty years ago discovered this rule and handed it on to Antonio Maria Fior of Venice, whose contest with Niccolò Tartaglia of Brescia gave Niccolò occasion to discover it. He [Tartaglia] gave it to me in response to my entreaties, though withholding the demonstration. Armed with this assistance, I sought out its demonstration in [various] forms. This was very difficult. My version of it is as follows.
Demonstration 


For example, let $GH^3$ plus six times its side $GH$ equal 20, and let $AE$ and $CL$ be two cubes the difference between which is 20 and such that the product of $AC$, the side [of one], and $CK$, the side [of the other], is 2, namely one-third the coefficient of $x$. Marking off $BC$ equal to $CK$, I say that, if this is done, the remaining line $AB$ is equal to $GH$ and is, therefore, the value of $x$, for $GH$ has already been given as [equal to $x$].

In accordance with the first proposition of the sixth chapter of this book, I complete the bodies $DA$, $DC$, $DE$, and $DF$; and as $DC$ represents $BC^3$, so $DF$ represents $AB^3$, $DA$ represents $3(BC \times AB^2)$ and $DE$ represents $3(AB \times BC^2)$. Since, therefore, $AC \times CK$ equals 2, $AC \times 3CK$ will equal 6, the coefficient of $x$; therefore $AB \times 3(AC \times CK)$ makes $6x$ or $6AB$, wherefore three times the product of $AB$, $AC$, and $BC$ is $6AB$. Now the difference between $AC^3$ and $CK^3$, manifesting itself as $BC^3$, which is equal to this by supposition, is 20, and from the first proposition of the sixth chapter is the sum of the bodies $DA$, $DE$, and $DF$. Therefore these three bodies equal 20.

Now, assume that $BC$ is negative:

$$AB^3 = AC^3 + 3(AC \times CB^2) + (-BC^3) + 3(-BC \times AC^2),$$

¹ Cardano, G., Ars Magna or The Rules of Algebra, T. Richard Witmer translator. 1968. Reprinted by permission of MIT Press.




Figure B.1 


by that demonstration. The difference between $3(BC \times AC^2)$ and $3(AC \times BC^2)$, however, is [three times] the product of $AB$, $BC$, and $AC$. Therefore, since this, as was demonstrated, is equal to $6AB$, add $6AB$ to the product of $3(AC \times BC^2)$, making $3(BC \times AC^2)$. But since $BC$ is negative, it is now clear that $3(BC \times AC^2)$ is negative and the remainder which is equal to it is positive. Therefore,

$$3(CB \times AB^2) + 3(AC \times BC^2) + 6AB = 0.$$

It will be seen, therefore, that as much as is the difference between $AC^3$ and $BC^3$, so much is the sum of

$$AC^3 + 3(AC \times CB^2) + 3(-CB \times AC^2) + (-BC^3) + 6AB.$$

This, therefore, is 20 and, since the difference between $AC^3$ and $BC^3$ is 20, then, by the second proposition of the sixth chapter, assuming $BC$ to be negative,

$$AB^3 = AC^3 + 3(AC \times BC^2) + (-BC^3) + 3(-BC \times AC^2).$$

Therefore, since we now agree that

$$AB^3 + 6AB = AC^3 + 3(AC \times BC^2) + 3(-BC \times AC^2) + (-BC^3) + 6AB,$$

which equals 20, as has been proved, [$AB^3 + 6AB$] will equal 20.




Since, therefore, $AB^3 + 6AB = 20$, and since $GH^3 + 6GH = 20$, it will be seen at once and from what is said in I, 35 and XI, 31 of the Elements that $GH$ will equal $AB$. Therefore $GH$ is the difference between $AC$ and $CB$. $AC$ and $CB$, or $AC$ and $CK$, the coefficients, however, are lines containing a surface equal to one-third the coefficient of $x$ and their cubes differ by the constant of the equation. Whence we have the rule:


Rule 


Cube one-third the coefficient of x; add to it the square of one-half the constant of 
the equation; and take the square root of the whole. You will duplicate this, and to 
one of the two you add one-half the number you have already squared and from the 
other you subtract one-half the same. You will then have a binomium and its apotome. 
Then, subtracting the cube root of the apotome from the cube root of the binomium, the remainder [or] that which is left is the value of $x$.


For example, $x^3 + 6x = 20$. Cube 2, which is one-third of 6, making 8; square 10, which is one-half the constant; 100 results. Add 100 and 8, making 108, the square root of which is $\sqrt{108}$. This you will duplicate: to one add 10, one-half the constant, and from the other subtract the same. Thus you will obtain the binomium $\sqrt{108} + 10$ and its apotome $\sqrt{108} - 10$. Take the cube roots of these. Subtract [the cube root of the] apotome from that of the binomium and you will have the value of $x$:

$$\sqrt[3]{\sqrt{108} + 10} - \sqrt[3]{\sqrt{108} - 10}.$$

Again, $x^3 + 3x = 10$. Cube 1, one-third of 3, and 1 results; square 5, one-half of 10, and 25 results; add 25 and 1, making 26; add 5 to and subtract it from the square root of this. You will thus form the binomium $\sqrt{26} + 5$ and its apotome $\sqrt{26} - 5$; whence $x$ equals

$$\sqrt[3]{\sqrt{26} + 5} - \sqrt[3]{\sqrt{26} - 5}.$$




Here you have proof:

The parts: $\sqrt[3]{\sqrt{26} + 5}$ and $-\sqrt[3]{\sqrt{26} - 5}$

The cubes of the parts (as is evident, the sum of these is 10): $\sqrt{26} + 5$ and $-(\sqrt{26} - 5)$

The squares of the parts: $\sqrt[3]{51 + \sqrt{2{,}600}}$ and $\sqrt[3]{51 - \sqrt{2{,}600}}$

Three times the squares of the parts: $\sqrt[3]{1{,}377 + \sqrt{1{,}895{,}400}}$ and $\sqrt[3]{1{,}377 - \sqrt{1{,}895{,}400}}$

The parts themselves: $-\sqrt[3]{\sqrt{26} - 5}$ and $\sqrt[3]{\sqrt{26} + 5}$

The products of the parts and three times their squares: $\sqrt[3]{\sqrt{49{,}299{,}354} + 6{,}885 - \sqrt{47{,}385{,}000} - 7{,}020}$ and $-\sqrt[3]{\sqrt{49{,}299{,}354} - 6{,}885 - \sqrt{47{,}385{,}000} + 7{,}020}$

Moreover, the cube roots contain four terms which can be reduced to two, for when 6,885 is subtracted from 7,020, the remainder is 135, and likewise when $\sqrt{47{,}385{,}000}$ is subtracted from $\sqrt{49{,}299{,}354}$ there is left $\sqrt{18{,}954}$. Therefore these products are

$$\sqrt[3]{\sqrt{18{,}954} - 135} - \sqrt[3]{\sqrt{18{,}954} + 135}.$$

The whole cube, then, from the demonstration in the third book is

$$10 + \sqrt[3]{\sqrt{18{,}954} - 135} - \sqrt[3]{\sqrt{18{,}954} + 135},$$


and three times the root, or $3x$, equals

$$\sqrt[3]{\sqrt{18{,}954} + 135} - \sqrt[3]{\sqrt{18{,}954} - 135}.$$

Finally, having added all together, since the universal cube roots cancel each other, the whole becomes $x^3 + 3x$ which equals exactly 10.
A third example: $x^3 + 6x = 2$. Raise 2, one-third the coefficient of $x$, to the cube and 8 is the result; square 1, half of 2, making 1; add 8 to 1, and 9 is produced, the square root of which is 3. Now duplicate 3 and to one add 1, half the constant, thus making 4, and from the other subtract half the constant, thus making 2. Then subtract the cube root of the less from the cube root of the greater and you have $\sqrt[3]{4} - \sqrt[3]{2}$ as the value of $x$.

Remember what we said in the chapter in the third book on extracting cube roots whenever these universal cube roots are equivalent to a whole number or a fraction. Thus, in the first example,

$$\sqrt[3]{\sqrt{108} + 10} - \sqrt[3]{\sqrt{108} - 10}$$

is 2, as is indicated by the rule there given and as is perfectly clear if it is tried out.
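Cardano's Rule above can be followed numerically. The sketch below is a modern floating-point illustration, not part of the excerpt: the names `cube_root` and `cardano_root` are hypothetical, and the code assumes the equation has the form $x^3 + px = q$ with $p > 0$, which is the case the chapter treats.

```python
import math


def cube_root(v):
    """Real cube root, valid for negative inputs as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def cardano_root(p, q):
    """Real root of x^3 + p*x = q for p > 0, following the steps of the Rule."""
    # Cube one-third the coefficient of x, add the square of half the constant,
    # and take the square root of the whole.
    radical = math.sqrt((p / 3) ** 3 + (q / 2) ** 2)
    binomium = radical + q / 2
    apotome = radical - q / 2
    # Subtract the cube root of the apotome from that of the binomium.
    return cube_root(binomium) - cube_root(apotome)


for p, q in [(6, 20), (3, 10), (6, 2)]:
    x = cardano_root(p, q)
    print(p, q, x, x ** 3 + p * x)   # the last column should be close to q
```

For the three examples of the chapter the computed root satisfies the equation up to rounding, and the first one comes out numerically equal to 2, in line with Cardano's closing remark.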


C. Excerpts from Abel’s A Demonstration of the Impossibility of the Algebraic Resolution of General Equations Whose Degree Exceeds Four¹


As is well known, it is possible to resolve the general equations up to the fourth degree, but the equations of higher degree only in some special cases, and, if I am not mistaken, the question “Is it possible to resolve in a general manner the equations whose degree exceeds four?” has yet to receive a satisfactory answer. It is the purpose of this memoir to respond to this question.

The algebraic resolution of an expression is an expression of its roots as algebraic functions of the coefficients. It is therefore necessary to first consider the general form of algebraic functions, and then to see whether it is possible to satisfy the given equation by replacing the unknown with an algebraic function.


III. On the number of distinct values that a function of several variables can assume when these variables are interchanged amongst themselves.


Let $v$ be a rational function of several independent variables $x_1, x_2, \ldots, x_n$. The number of different values to which this function is subjected upon interchanging the quantities on which it depends cannot exceed the product $1 \cdot 2 \cdot 3 \cdots n$. Let $\mu$ be this product.


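Now let

$$v\binom{\alpha\ \beta\ \gamma\ \delta\ \ldots}{a\ b\ c\ d\ \ldots}$$

be the value that some function $v$ assumes when $x_a, x_b, x_c, x_d, \ldots$ are substituted for $x_\alpha, x_\beta, x_\gamma, x_\delta, \ldots$. It is clear that when $A_1, A_2, \ldots, A_\mu$ denote the $\mu$ permutations that can be formed with the indices $1, 2, 3, \ldots, n$, the different values of $v$ can be expressed as

$$v\binom{A_1}{A_1},\ v\binom{A_1}{A_2},\ v\binom{A_1}{A_3},\ \ldots,\ v\binom{A_1}{A_\mu}.$$

¹ Journal für die reine und angewandte Mathematik (Crelle), Vol. 1, Berlin, 1826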


Introductory Modern Algebra, Second Edition. 389 
By Saul Stahl Copyright © 2013 John Wiley & Sons, Inc. 




Suppose that the number of distinct values of $v$ is less than $\mu$; it is then necessary that
some of the values of $v$ be equal to each other, say
$$v\binom{A_1}{A_1} = v\binom{A_1}{A_2} = \cdots = v\binom{A_1}{A_m}.$$


If the permutation denoted by 


is applied to these quantities we will have this new series of equal values: 


$$v\binom{A_1}{A_{m+1}} = v\binom{A_1}{A_{m+2}} = \cdots = v\binom{A_1}{A_{2m}},$$


values which are different from the first ones, but the same in number. By changing these 


quantities by the substitution denoted by 


$$\binom{A_1}{A_{2m+1}}$$


we obtain a new system of equal quantities which are, however, different from the preceding
ones. By continuing this process until all the permutations have been exhausted,
the $\mu$ values of $v$ will be partitioned into several systems, each of which will contain $m$
equal values. It follows from this that if the number of distinct values of $v$ is represented
by $\rho$, a number that equals that of the systems, we have $\rho m = 1 \cdot 2 \cdot 3 \cdots n$. That is, the
number of distinct values that a function of $n$ quantities can assume under all the possible
permutations of these quantities is necessarily a divisor of the product $1 \cdot 2 \cdot 3 \cdots n$. This
is known.
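The partition into systems of equal size can be watched in a small case. The Python sketch below is our own illustration, not Abel's; the function $v = x_1x_2 + x_3x_4$ and the sample point are assumptions chosen for convenience. It evaluates $v$ at all $1 \cdot 2 \cdot 3 \cdot 4 = 24$ arrangements of four distinct numbers and counts how often each value occurs. Evaluating at one numerical point could in principle make two formally different values coincide; for this sample it does not.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def value_systems(f, sample):
    """Count how many arrangements of `sample` produce each value of f."""
    return Counter(f(*arrangement) for arrangement in permutations(sample))


def v(x1, x2, x3, x4):
    # Hypothetical example function of four quantities.
    return x1 * x2 + x3 * x4


sample = (2, 3, 5, 7)                 # distinct values for x1, ..., x4
systems = value_systems(v, sample)
mu = factorial(len(sample))           # the product 1*2*3*4
rho = len(systems)                    # number of distinct values of v
sizes = set(systems.values())         # size m of each system of equal values
print(rho, sizes, mu)                 # prints: 3 {8} 24
```

Here $\rho = 3$ and $m = 8$, so that $\rho m = 24$, in agreement with the divisibility statement.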


Now, let 


be an arbitrary substitution. Suppose that in applying it several times successively to the
function $v$ we obtain the sequence of values $v, v_1, v_2, \ldots, v_p$. It is clear that $v$ will be
necessarily repeated several times. When $v$ returns after $p$ substitutions we say that


is a recurring substitution of order p. We then have the following periodic series: 
$$v, v_1, v_2, \ldots, v_{p-1}, v, v_1, v_2, \ldots$$


wherein if 


represents the value of v which is obtained after r repetitions of the substitution denoted 


by 


we obtain the series 


It follows that 


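$$v\binom{A_1}{A_m}^{\alpha p + r} = v\binom{A_1}{A_m}^{r} \quad\text{and}\quad v\binom{A_1}{A_m}^{\alpha p} = v\binom{A_1}{A_m}^{0} = v.$$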


However, if $p$ is the largest prime no greater than $n$, then if the number of distinct values
of $v$ is less than $p$, it must be the case that amongst $p$ values some two must equal each
other.


It therefore follows that of the p values 


$$v\binom{A_1}{A_m}^{0},\quad v\binom{A_1}{A_m}^{1},\quad v\binom{A_1}{A_m}^{2},\quad \ldots,\quad v\binom{A_1}{A_m}^{p-1},$$


Then 




Writing $r$ for $r' + p - r$ and noting that


we conclude that 


where $r$ is clearly not a multiple of $p$. The value of $v$ is therefore not changed by the


substitution 


where $\alpha$ is an integer. If $p$ is a prime number, then it is clearly always possible to find
two integers $\alpha$ and $\beta$ such that $r\alpha = p\beta + 1$, hence


$$v = v\binom{A_1}{A_m}^{p\beta + 1}.$$


Since 


it follows that 


of order p. 


However, it is clear that 


ie eee oe) 
and 
Byde...ne yaps...0n 




are recurrent substitutions of order $p$ when $p$ is the number of indices $\alpha, \beta, \ldots, \eta$. The
value of $v$ will therefore not be changed by the combination of these two. These substitutions
are clearly equivalent to the single one


$$\binom{\alpha\,\beta\,\gamma}{\gamma\,\alpha\,\beta}$$


and this one to the following two, applied successively: 


$$\binom{\alpha\,\beta}{\beta\,\alpha} \quad\text{and}\quad \binom{\beta\,\gamma}{\gamma\,\beta}.$$


The value of v will therefore not be changed by the combination of these two substitutions. 


(5209) 
(23)(69) 
(5252) 


We see from this that the function v is unchanged by two successive substitutions 


Hence 


and similarly 


from which it follows that 


of the form $\binom{\alpha\,\beta}{\beta\,\alpha}$, $\alpha$ and $\beta$ being any two indices. If such a substitution is called a
transposition, it may be concluded that any value of $v$ will not be changed by an even
number of transpositions, and that consequently all the values of $v$ which result from
an even number of substitutions are equal. Every exchange of the elements of a function
can be effected by means of a certain number of transpositions; hence the function $v$ can
have only two values. The following theorem follows from this: The number of different
values that a function of $n$ quantities can assume is not less than the largest prime not
exceeding $n$, unless it is either 2 or 1.

It is therefore impossible to find a function of five quantities which has three or four 
different values. The demonstration of this theorem is taken from a memoir of Mr. Cauchy 


which appears in the 17th volume of the Journal de l'école polytechnique, p. 1. 


D. Excerpts from Galois’s On the Theory of Numbers* 


When it is agreed to consider as zero all the quantities which are the multiples of a 
given prime number p, and, subject to this convention, one looks for solutions to the 
polynomial equation $Fx = 0$, i.e., the equation that Mr. Gauss denotes by $Fx \equiv 0$, it is
customary to consider only integer solutions to these sorts of questions. Having been led 
by some specific researches to consider their irrational solutions, I have arrived at some 
results that I believe to be new. 

Let there be given such an equation or congruence, $Fx = 0$, and let $p$ be the modulus.
Suppose first that the congruence in question admits no rational factors, that is, there
exist no three polynomials $\varphi x$, $\psi x$, $\chi x$ such that $\varphi x \cdot \psi x = Fx + p \cdot \chi x$. In that case
the congruence has no integer roots, nor any irrational root of smaller degree. One
should therefore regard the roots of this congruence as some kind of imaginary symbols
(since they do not satisfy the same questions as integers), symbols whose employment, in
calculations, will often prove as useful as that of the imaginary $\sqrt{-1}$ in ordinary analysis.

We are concerned here with the classification of these imaginaries and their reduction 
to the smallest possible number. 

Let $i$ denote one of the roots of the congruence $Fx = 0$, which can be supposed to
have degree $\nu$. Consider the general expression
$$a + a_1 i + a_2 i^2 + \cdots + a_{\nu-1} i^{\nu-1}, \qquad (A)$$
where $a, a_1, a_2, \ldots, a_{\nu-1}$ represent integers. When these numbers are assigned all their possible
values, expression (A) runs through $p^\nu$ values which possess, as I shall demonstrate,
the same properties as the natural numbers in the theory of residues of powers.


Of the expressions (A), we shall only take the $p^\nu - 1$ values obtained when
$a, a_1, a_2, \ldots, a_{\nu-1}$


* Bulletin des Sciences mathématiques de M. Ferussac, Vol 13, Jun 1830, 428-435; with the following note: 
“This memoire forms part of the research of Mr. Galois on the theory of permutations and algebraic equations.” 




are not all zero; let $\alpha$ be one of these expressions.

If $\alpha$ is successively raised to the second, third, ... powers, a sequence of quantities all
of which have the same form is obtained (since every function of $i$ is reducible to the
$(\nu-1)$-th degree). Hence it must be that $\alpha^n = 1$ for some $n$; let $n$ be the smallest number
such that $\alpha^n = 1$. Then the numbers $1, \alpha, \ldots, \alpha^{n-1}$ are all distinct. Multiply these
$n$ numbers by another expression $\beta$ of the same form. We then obtain another new group
of quantities all different from the first group as well as from each other. If the quantities
(A) have not been exhausted yet, the powers of $\alpha$ can be multiplied by a new expression $\gamma$,
and so on. Consequently, the number $n$ necessarily divides the total number of quantities
of type (A). Since this number is $p^\nu - 1$, we see that $n$ divides $p^\nu - 1$. From this it also
follows that $\alpha^{p^\nu - 1} = 1$, or $\alpha^{p^\nu} = \alpha$.

Next it can be proven, just as is done in the theory of numbers, that there exist primitive
roots $\alpha$ for which $p^\nu - 1 = n$, and which consequently reproduce, by their powers, the
complete sequence of all the other roots.

And any one of these primitive roots depends only on a congruence of degree $\nu$, a
congruence which must be irreducible, since otherwise the equation of $i$ could not be
irreducible either, as the roots of the congruence in $i$ are all powers of the primitive root.

We note here the remarkable result that all the algebraic quantities that arise in this
theory are roots of equations of the form $x^{p^\nu} = x$. This proposition is stated algebraically
as follows: Given a function $Fx$ and an integer $p$, one can write $fx \cdot Fx = x^{p^\nu} - x + p\,\varphi x$,
$fx$ and $\varphi x$ being entire functions, whenever the congruence $Fx \equiv 0 \pmod p$ is
irreducible.

If it is desired to express all the roots of such a congruence in terms of one, it suffices
to note that in general $(Fx)^{p^n} = F(x^{p^n})$ and that, consequently, if one of the roots is $x$
then the others are $x^p, x^{p^2}, \ldots, x^{p^{\nu-1}}$.* We now show that, conversely, the roots of the
equation or of the congruence $x^{p^\nu} = x$ all depend on a single congruence of degree $\nu$.
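The statement that the remaining roots are the successive $p$-th powers of one root can be tried on a small case. The sketch below is our own illustration: it assumes the irreducible congruence $i^3 = 2 \pmod 7$ as a convenient example, represents $a + a_1 i + a_2 i^2$ as a triple of residues, and uses helper names (`mul`, `power`) that are ours. It checks that $i$, $i^7$, and $i^{49}$ are three distinct quantities whose cubes all equal 2.

```python
P = 7  # the modulus p (assumed example)


def mul(u, v):
    """Multiply two expressions a + a1*i + a2*i^2 modulo 7, using i^3 = 2."""
    c = [0] * 5
    for j, uj in enumerate(u):
        for k, vk in enumerate(v):
            c[j + k] += uj * vk
    # i^3 = 2 and i^4 = 2*i fold the high terms back into degrees 0 and 1.
    return ((c[0] + 2 * c[3]) % P, (c[1] + 2 * c[4]) % P, c[2] % P)


def power(u, n):
    """Raise u to the n-th power by repeated multiplication."""
    result = (1, 0, 0)
    for _ in range(n):
        result = mul(result, u)
    return result


i = (0, 1, 0)
roots = [power(i, P ** j) for j in range(3)]   # i, i^7, i^49
cubes = [power(r, 3) for r in roots]
print(roots)   # prints: [(0, 1, 0), (0, 4, 0), (0, 2, 0)]
print(cubes)   # prints: [(2, 0, 0), (2, 0, 0), (2, 0, 0)]
```

In this example the three roots are $i$, $4i$, and $2i$.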

Let $i$ be a root of an irreducible congruence and such that all the roots of the congruence
$x^{p^\nu} = x$ are rational functions of $i$ (as is the case for ordinary equations, it is clear that
this property holds here as well).


*It would be wrong to conclude from the fact that the roots of the irreducible congruence $Fx = 0$ of degree
$\nu$ are expressible as the sequence $x, x^p, x^{p^2}, \ldots, x^{p^{\nu-1}}$ that these roots are always expressible by radicals. Here
is an example to the contrary: The irreducible congruence $x^2 + x + 1 = 0 \pmod 2$ yields $x = (-1 + \sqrt{-3})/2$,
which reduces to $0/0 \pmod 2$, from which formula we learn nothing.

The general proposition in question here can be stated as follows: Given an algebraic equation, it is possible
to find a rational function $\theta$ of all of its roots such that, reciprocally, each of the roots is rationally expressible in



It is clear that the degree $\mu$ of the congruence in $i$ cannot be less than $\nu$, since otherwise
the congruence
$$x^{p^\nu} - x = 0 \qquad (B)$$
would share all of its roots with the congruence $x^{p^\mu} - x = 0$, which is impossible, since the
congruence (B) has no repeated roots, as is seen by taking the derivative of the first part.
I claim that neither can $\mu$ exceed $\nu$.

In fact, if that were the case, all the roots of the congruence $x^{p^\mu} = x$ would depend
rationally on those of the congruence $x^{p^\nu} = x$. But it is easily seen that if $i^{p^\nu} = i$, then
every rational function $h = f(i)$ would yield
$$(f(i))^{p^\nu} = f(i^{p^\nu}) = f(i),$$
from which $h^{p^\nu} = h$.

Hence all the roots of the congruence $x^{p^\mu} = x$ would also be roots of the equation
$x^{p^\nu} = x$, which is impossible.

We therefore now know that all the roots of the equation $x^{p^\nu} = x$ necessarily depend
on only one irreducible congruence of degree $\nu$.

Now, the most general method for obtaining this irreducible congruence on which the
roots of the congruence $x^{p^\nu} = x$ depend, is to extract first out of this congruence all the
factors that it shares with congruences of smaller degree and of the form $x^{p^\mu} = x$.

One thus obtains a congruence which must factor into irreducible congruences of
degree $\nu$. And, since it is known how to express all the roots of each of these irreducible
congruences in terms of a single one, it will be easy to obtain all of them by Mr. Gauss’s
method.

Most frequently, however, it will be easy to find by trial and error an irreducible
congruence of a given degree $\nu$, and the rest must be derived from it.


For example, let $p = 7$ and $\nu = 3$. Let us look for the roots of the congruence
$$x^{7^3} = x \pmod 7. \qquad (1)$$


$\theta$. This theorem was known to Abel as one can see in the first part of the memoir on elliptic functions which
this celebrated geometer left.


I note first that since the congruence
$$i^3 = 2 \pmod 7 \qquad (2)$$
is irreducible and of degree three, all the roots of congruence (1) depend rationally
on those of congruence (2), so that all the roots of (1) have the form
$$a + a_1 i + a_2 i^2 \quad\text{or}\quad a + a_1\sqrt[3]{2} + a_2\sqrt[3]{4}. \qquad (3)$$


It is now necessary to find a primitive root, that is, a form of expression (3) which, when
raised to all possible powers, gives all the roots of the congruence $x^{7^3-1} = 1 \pmod 7$, or
$x^{2 \cdot 3^2 \cdot 19} = 1 \pmod 7$, and to accomplish this we only need a primitive root of each of the
congruences $x^2 = 1$, $x^{3^2} = 1$, and $x^{19} = 1$.

The primitive root of the first is $-1$; those of the second are given by the equations
$x^3 = 2$ and $x^3 = 4$, so that $i$ is a primitive root of $x^{3^2} = 1$.


It only remains to find a root of $x^{19} - 1 = 0$, or rather of
$$\frac{x^{19} - 1}{x - 1} = 0,$$
and we first try to see whether the requirements can be satisfied by taking $x = a + a_1 i$
rather than $a + a_1 i + a_2 i^2$; we must have $(a + a_1 i)^{19} = 1$, which, when developed by
Newton’s formula, after reducing the powers of $a$, $a_1$, and $i$ by applying the formulas
$a^{m(p-1)} = 1$, $a_1^{m(p-1)} = 1$, and $i^3 = 2$, reduces to
3 [a —a‘a3 + (a> a? +a°a?) i?] =1, 


from which, by separation, 3a — 3a4a? =1 and aa; + aa? =0. 

These last two equations are satisfied by setting $a = -1$ and $a_1 = 1$. Hence $-1 + i$
is a primitive root of $x^{19} = 1$. We found above that the values $-1$ and $i$ are primitive
roots of $x^2 = 1$ and $x^{3^2} = 1$; it only remains to multiply the three quantities $-1$, $i$, and
$-1 + i$, and the product $i - i^2$ will be a primitive root of the congruence $x^{7^3-1} = 1$.

Thus here the expression $i - i^2$ possesses the property that, in raising it to all powers,
$7^3 - 1$ distinct expressions of the form $a + a_1 i + a_2 i^2$ are obtained.
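The orders claimed in this example can be confirmed by direct computation. The following Python sketch is our own check, not part of the memoir; the triple representation and the names `mul` and `order` are assumptions made for the illustration. It multiplies expressions $a + a_1 i + a_2 i^2$ modulo 7 with $i^3 = 2$ and finds the least power of each quantity that equals 1.

```python
P = 7  # the modulus of the example


def mul(u, v):
    """Multiply two expressions a + a1*i + a2*i^2 modulo 7, using i^3 = 2."""
    c = [0] * 5
    for j, uj in enumerate(u):
        for k, vk in enumerate(v):
            c[j + k] += uj * vk
    # i^3 = 2 and i^4 = 2*i fold the high terms back into degrees 0 and 1.
    return ((c[0] + 2 * c[3]) % P, (c[1] + 2 * c[4]) % P, c[2] % P)


ONE = (1, 0, 0)


def order(u):
    """Smallest n >= 1 with u**n == 1; u must be a nonzero expression."""
    n, current = 1, u
    while current != ONE:
        current = mul(current, u)
        n += 1
    return n


minus_one = (6, 0, 0)      # -1
i = (0, 1, 0)              # i
minus_one_plus_i = (6, 1, 0)   # -1 + i
candidate = (0, 1, 6)      # i - i^2

print(order(minus_one), order(i), order(minus_one_plus_i), order(candidate))
# prints: 2 9 19 342
```

Since $342 = 7^3 - 1$, the powers of $i - i^2$ do run through all the nonzero expressions of the form $a + a_1 i + a_2 i^2$.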

If we wish to find the lowest degree congruence on which our primitive root depends,
it is necessary to eliminate $i$ from the two equations $i^3 = 2$ and $\alpha = i - i^2$. One then
obtains $\alpha^3 - \alpha + 2 = 0$.




We agree to take imaginaries as a basis and to denote by $i$ the root of this equation, so
that
$$i^3 - i + 2 = 0, \qquad (C)$$
and we will have all the imaginaries of the form $a + a_1 i + a_2 i^2$ when $i$ is raised to all of
its powers and they are reduced by equation (C).

The main advantage of this new theory that is propounded here is that it restores to 
congruences the property (so useful for ordinary equations) that they possess as many 
roots as there are units in their degree. 

The method of obtaining all of these roots is very simple. First, it is always possible to 
modify the given congruence Fx = 0 so that it does not have equal roots, or, in other 
words, so that it does not possess a common factor with F’x = 0, and the means for 
doing so are evidently the same as those for ordinary equations. 

Next, in order to obtain the integral solutions, it will suffice, as Mr. Libri seems to
have been the first to remark, to look for the greatest common factor of $Fx = 0$ and
$x^{p-1} = 1$.

To find the imaginary solutions of the second degree, it is necessary to look for the
greatest common factor of $Fx = 0$ and $x^{p^2-1} = 1$, and, in general, the solutions of order
$\nu$ are given by the greatest common factor of $Fx = 0$ and $x^{p^\nu-1} = 1$.

It is above all in the theory of permutations, where it is often necessary to vary the 
form of the indices, that the consideration of imaginary roots of congruences seems 
indispensable. It provides a simple and easy method for recognizing in what case a 
primitive equation is solvable by radicals, as I will attempt to describe in a few words. 

Let $fx = 0$ be an algebraic equation of degree $p^\nu$; suppose that the $p^\nu$ roots are
denoted by $x_k$, where the index $k$ assumes the $p^\nu$ values determined by the congruence
$k^{p^\nu} = k \pmod p$.

Let $V$ be an arbitrary rational function of the $p^\nu$ roots $x_k$. Transform this function by
replacing each index $k$ by the index $(ak + b)^{p^r}$, $a$, $b$, and $r$ being arbitrary constants
satisfying the requirements $a^{p^\nu-1} = 1 \pmod p$, $b^{p^\nu} = b \pmod p$ and $r$ integral.

By assigning to the constants $a$, $b$, and $r$ all their admissible values, we obtain a total of
$p^\nu(p^\nu - 1)\nu$ ways of permuting the roots amongst themselves by means of permutations of
the form $[x_k, x_{(ak+b)^{p^r}}]$, and the function $V$ will in general assume $p^\nu(p^\nu - 1)\nu$ different
forms as a result of these substitutions.

Assume now that the proposed equation $fx = 0$ is solvable by radicals, and, to prove
this result, it suffices to note that the value substituted for $k$, in each index, can be
expressed in the three forms
$$(ak + b)^{p^r} = [a(k + b')]^{p^r} = a'k^{p^r} + b'' = a'(k + b')^{p^r}.$$


Those who are familiar with the theory of equations will see this easily. 

This remark would have had little significance had I not succeeded in showing that, 
conversely, every primitive equation that is solvable by radicals must satisfy the conditions 
I have just stated. (The equations of the ninth and twenty-fifth degrees are excepted from 
this rule.) 

Thus, for each number of the form $p^\nu$ it is possible to form a group of permutations
such that every function of the roots that is invariant under the action of these permutations
must admit a rational value when the equation of degree $p^\nu$ is primitive and
solvable by radicals.

Moreover, only equations of such degree $p^\nu$ are simultaneously both primitive and
solvable by radicals.

The general theorem I have just announced makes precise and develops the conditions
that I specified in the Bulletin of the month of April. It indicates the means for forming
a function of the roots whose value will be rational whenever the primitive equation of
degree $p^\nu$ is solvable by radicals, and consequently it leads to a characterization of the
solvability of these equations by means of calculations which, while perhaps not feasible
in practice, are at least theoretically possible.

Note that in the case $\nu = 1$ the various values of $k$ consist of the sequence of integers.
There are then $p(p - 1)$ substitutions of the form $(x_k, x_{ak+b})$.

The function which, in the case of equations that are solvable by radicals, has a rational
value depends, in general, on an equation of degree $1 \cdot 2 \cdot 3 \cdots (p-2)$, to which it is
necessary, consequently, to apply the method of rational roots.


E. Excerpts from Cayley’s The Theory of Groups* 


Substitutions, and (in connexion therewith) groups, have been a good deal studied; but 
only a little has been done towards the solution of the general problem of groups. I give 
the theory so far as is necessary for the purpose of pointing out what appears to me to be 
wanting. 

Let $\alpha, \beta, \ldots$ be functional symbols, each operating upon one and the same number
of letters and producing as its result the same number of functions of these letters; for
instance, $\alpha(x, y, z) = (X, Y, Z)$, where the capitals denote each of them a given function
of $(x, y, z)$.

Such symbols are susceptible of repetition and combination; $\alpha^2(x, y, z) = \alpha(X, Y, Z)$,
or $\beta\alpha(x, y, z) = \beta(X, Y, Z)$, equal in each case to three given functions of $(x, y, z)$ and
similarly $\alpha^3$, $\alpha^2\beta$, etc.

The symbols are not in general commutative, $\alpha\beta$ not equal to $\beta\alpha$; but they are associative,
$\alpha\beta \cdot \gamma = \alpha \cdot \beta\gamma$, each of which is equal to $\alpha\beta\gamma$, which has thus a determinate
signification. [The associativeness of such symbols arises from the circumstance that the
definitions of $\alpha, \beta, \ldots$ determine the meanings of $\alpha\beta$, $\alpha\gamma$, etc.: if $\alpha, \beta, \ldots$ were
quasi-quantitative symbols such as the quaternion imaginaries $i$, $j$, and $k$, then $\alpha\beta$ and $\beta\gamma$
might have by definition values $\delta$ and $\varepsilon$ such that $\alpha\beta \cdot \gamma$ and $\alpha \cdot \beta\gamma$, equal to $\delta\gamma$ and $\alpha\varepsilon$,
respectively, have unequal values.]

Unity as a functional symbol denotes that the letters are unaltered, $1(x, y, z) = (x, y, z)$;
whence $1\alpha = \alpha 1 = \alpha$.

The functional symbols may be substitutions; $\alpha(x, y, z) = (y, z, x)$, the same letters
in a different order: substitutions can be represented by the notation $\alpha = \frac{yzx}{xyz}$, the
substitution which changes $xyz$ into $yzx$, or as products of cyclical substitutions,
$\alpha = \frac{yzxwu}{xyzuw} = (xyz)(uw)$, the product of the cyclical interchanges $x$ into $y$, $y$ into $z$,
and $z$ into $x$; and $u$ into $w$, $w$ into $u$.


* The American Journal of Mathematics, 1 (1878), 50-52. Reprinted with permission of The Johns Hopkins
University Press.




A set of symbols $\alpha, \beta, \ldots$ such that the product $\alpha\beta$ of each two of them (in each
order, $\alpha\beta$ or $\beta\alpha$) is a symbol of the set, is a group. It is easily seen that 1 is a symbol of
every group, and we may therefore give the definition in the form that a set of symbols
$1, \alpha, \beta, \ldots$ satisfying the foregoing condition is a group. When the number of the symbols
(or terms) is equal to $n$, then the group is of the $n$-th order; and each $\alpha$ is such that
$\alpha^n = 1$, so that a group of the order $n$ is, in fact, a group of symbolical $n$-th roots of
unity.

A group is defined by means of the laws of combination of its symbols: for the statement 
of these we may either (by the introduction of powers and products) diminish as much 
as may be the number of independent functional symbols, or else, using distinct letters 
for the several terms of the group, employ a square diagram as presently mentioned. 

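Thus in the first mode, a group is $1, \beta, \beta^2, \alpha, \alpha\beta, \alpha\beta^2$ $(\alpha^2 = 1, \beta^3 = 1, \alpha\beta = \beta^2\alpha)$;
where observe that these conditions imply also $\alpha\beta^2 = \beta\alpha$. Or, in the second mode calling
the same group $(1, \alpha, \beta, \gamma, \delta, \varepsilon)$, the laws of combination are given by the square diagram


for the symbols $(1, \alpha, \beta, \gamma, \delta, \varepsilon)$ are in fact equal to $(1, \alpha, \beta, \alpha\beta, \beta^2, \alpha\beta^2)$.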

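The general problem is to find all the groups of a given order $n$; thus if $n = 2$, the only
group is $1, \alpha$ where $\alpha^2 = 1$; $n = 3$, the only group is $1, \alpha, \alpha^2$ where $\alpha^3 = 1$; $n = 4$, the
groups are $1, \alpha, \alpha^2, \alpha^3$ with $\alpha^4 = 1$ and $1, \alpha, \beta, \alpha\beta$ where $\alpha^2 = 1$, $\beta^2 = 1$, and $\alpha\beta = \beta\alpha$;
$n = 6$, there are three groups, a group $1, \alpha, \alpha^2, \alpha^3, \alpha^4, \alpha^5$ where $\alpha^6 = 1$; and two groups
$1, \beta, \beta^2, \alpha, \alpha\beta, \alpha\beta^2$ where $\alpha^2 = 1$ and $\beta^3 = 1$, viz. in the first of these $\alpha\beta = \beta\alpha$; while in
the other of them (that mentioned above) we have $\alpha\beta = \beta^2\alpha$ and $\alpha\beta^2 = \beta\alpha$.


* If $n = 5$, the only group is $1, \alpha, \alpha^2, \alpha^3, \alpha^4$ where $\alpha^5 = 1$.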




But although the theory as above stated is a general one, including as a particular
case the theory of substitutions, yet the general problem of finding all the groups of a
given order $n$ is really identical with the apparently less general problem of finding all
of the groups of the same order $n$, which can be formed with the substitutions upon $n$
letters; in fact, referring to the diagram, it appears that $1, \alpha, \beta, \gamma, \delta, \varepsilon$ may be regarded
as substitutions performed upon the six letters $1, \alpha, \beta, \gamma, \delta, \varepsilon$, viz., 1 is the substitution
unity which leaves the order unaltered, $\alpha$ the substitution which changes $1\alpha\beta\gamma\delta\varepsilon$ into
$\alpha 1\gamma\beta\varepsilon\delta$, and so for $\beta$, $\gamma$, $\delta$, and $\varepsilon$. This, however, does not in any wise show that the
best or easiest mode of treating the general problem is thus to regard it as a problem of
substitutions: and it seems clear that the better course is to consider the general problem
in itself, and to deduce from it the theory of groups of substitutions.
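The laws of combination of the group of order six described above can be generated mechanically. The Python sketch below is an illustration of ours and not a reproduction of Cayley's diagram: it assumes a particular realization of $\alpha$ and $\beta$ as substitutions of three letters (any transposition and any cyclical substitution of three letters would serve), and the helper names are hypothetical. It builds the table of products and reads off the row belonging to $\alpha$.

```python
from itertools import product


def compose(f, g):
    """Return the substitution 'f after g'; a tuple t stands for the map j -> t[j]."""
    return tuple(f[g[j]] for j in range(len(g)))


identity = (0, 1, 2)
alpha = (1, 0, 2)    # a transposition, so alpha^2 = 1 (assumed realization)
beta = (1, 2, 0)     # a cyclical substitution, so beta^3 = 1 (assumed realization)
beta2 = compose(beta, beta)

elements = {
    "1": identity,
    "α": alpha,
    "β": beta,
    "γ": compose(alpha, beta),     # γ = αβ
    "δ": beta2,                    # δ = β²
    "ε": compose(alpha, beta2),    # ε = αβ²
}
name_of = {perm: name for name, perm in elements.items()}

# table[(a, b)] is the name of the product ab.
table = {
    (a, b): name_of[compose(elements[a], elements[b])]
    for a, b in product(elements, elements)
}

alpha_row = [table[("α", b)] for b in elements]
print(alpha_row)   # prints: ['α', '1', 'γ', 'β', 'ε', 'δ']
```

The row for $\alpha$ is the arrangement $\alpha 1\gamma\beta\varepsilon\delta$, so that multiplication by $\alpha$ acts as a substitution upon the six symbols of the group, as the text describes.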


Cambridge, 26th November, 1877. 


F. Mathematical Induction


The Principle of Mathematical Induction (PMI) is not a theorem. It is a powerful method
for proving theorems. Other such methods are proof by contradiction, argument by
symmetry, and the pigeonhole principle. The following is probably the simplest form of
this principle:


Principle of Mathematical Induction, Version 1 Suppose that a set $S$ of positive integers
has the two properties that $1 \in S$ and if $k \in S$ then $k + 1 \in S$. Then $S$ consists of all the
positive integers.


This is eminently reasonable. By the first property, $1 \in S$. The second property therefore
implies that $2 = 1 + 1 \in S$. Now that $2 \in S$, it follows from the second property that
$3 = 2 + 1 \in S$, and similarly $4 = 3 + 1$, $5 = 4 + 1$, and so on, are all in $S$.


Let us see how this obvious principle can be used to prove a nonobvious fact. 


Theorem F.1 If $n$ is any positive integer, then
$$1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}. \qquad (F.2)$$


Proof. Let $S$ be the set of all the positive integers $n$ for which Equation F.2 is valid. The
PMI will be used to demonstrate that $S$ consists of all the positive integers, which is, of
course, tantamount to proving the proposition.

It must first be shown that $1 \in S$. In other words, it must first be shown that when
$n$ is replaced by the integer 1 in Equation F.2, a true statement is obtained. This,
however, is easily verified, since this replacement transforms Equation F.2 to the statement
$1 = 1(1+1)/2$, which is true.

Second, it must be shown that the assumption $k \in S$ leads to the conclusion $k + 1 \in S$.
That $k \in S$ means that Equation F.2 is valid when $n$ is replaced by $k$, so that
$$1 + 2 + 3 + \cdots + k = \frac{k(k+1)}{2}. \qquad (F.3)$$




To conclude that $k + 1 \in S$, it must be shown that
$$1 + 2 + 3 + \cdots + k + (k+1) = \frac{(k+1)[(k+1)+1]}{2}.$$
This is demonstrated, using Equation F.3, as follows:
$$\begin{aligned}
1 + 2 + 3 + \cdots + k + (k+1) &= \frac{k(k+1)}{2} + (k+1) \\
&= \frac{k(k+1) + 2(k+1)}{2} \\
&= \frac{(k+1)(k+2)}{2} \\
&= \frac{(k+1)[(k+1)+1]}{2}.
\end{aligned}$$
Thus the set $S$ enjoys both of the properties required by the PMI, and so it consists of all
the positive integers. In other words, Equation F.2 is valid for all the positive integers. ∎
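The two parts of this proof can be mirrored in a few lines of Python. The sketch below is only an illustration (the function names are ours): it checks the case $n = 1$, checks the algebraic identity used in passing from $k$ to $k + 1$ for a range of values of $k$, and compares the formula with a direct summation. A finite check of this kind is evidence, not a proof; it is the induction argument that covers all positive integers.

```python
def partial_sum(n: int) -> int:
    """Compute 1 + 2 + ... + n by direct addition."""
    return sum(range(1, n + 1))


def closed_form(n: int) -> int:
    """Right-hand side of Equation F.2, n(n+1)/2."""
    return n * (n + 1) // 2


def step_identity_holds(k: int) -> bool:
    """Check that k(k+1)/2 + (k+1) equals (k+1)[(k+1)+1]/2 for this k."""
    return closed_form(k) + (k + 1) == closed_form(k + 1)


base_case = partial_sum(1) == closed_form(1)
step_cases = all(step_identity_holds(k) for k in range(1, 1000))
direct_cases = all(partial_sum(n) == closed_form(n) for n in range(1, 200))
print(base_case, step_cases, direct_cases)   # prints: True True True
```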


The same method is now used to prove a well-known formula that is often demonstrated 


by other means. 


Theorem F.4 If $r$ is any number distinct from 1 and $n$ is any positive integer, then
$$1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r}. \qquad (F.5)$$
Proof. Let $S$ be the set of positive integers for which Equation F.5 is valid. Our strategy
will be to use the PMI to show that the set $S$ consists of all the positive integers.
For $n = 1$, Equation F.5 reduces to $1 + r = (1 - r^2)/(1 - r)$, which is true, since
$1 - r^2 = (1 - r)(1 + r)$. Thus $1 \in S$.
Next, suppose that $k \in S$. In other words, suppose that
$$1 + r + r^2 + \cdots + r^k = \frac{1 - r^{k+1}}{1 - r}. \qquad (F.6)$$
Making use of Equation F.6,
$$\begin{aligned}
1 + r + r^2 + \cdots + r^k + r^{k+1} &= \frac{1 - r^{k+1}}{1 - r} + r^{k+1} \\
&= \frac{1 - r^{k+1} + (1 - r)r^{k+1}}{1 - r} \\
&= \frac{1 - r^{k+1} + r^{k+1} - r^{k+2}}{1 - r} \\
&= \frac{1 - r^{k+2}}{1 - r} = \frac{1 - r^{(k+1)+1}}{1 - r},
\end{aligned}$$
which means that Equation F.5 holds for $n = k + 1$ as well. In other words, $k + 1 \in S$.
Thus the set $S$ enjoys both of the properties required by the PMI, and so it consists of all
the positive integers. In other words, Equation F.5 is valid for all the positive integers. ∎
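Equation F.5 can also be tried out with exact rational arithmetic. The Python sketch below is an illustration only; the function names and the sample values of $r$ are our own choices. It uses the standard `fractions` module so that no rounding error enters the comparison.

```python
from fractions import Fraction


def geometric_sum(r: Fraction, n: int) -> Fraction:
    """Compute 1 + r + r^2 + ... + r^n term by term."""
    return sum((r ** j for j in range(n + 1)), Fraction(0))


def closed_form(r: Fraction, n: int) -> Fraction:
    """Right-hand side of Equation F.5; requires r != 1."""
    return (1 - r ** (n + 1)) / (1 - r)


sample_ratios = [Fraction(2, 3), Fraction(-3), Fraction(5, 2), Fraction(0)]
agrees = all(
    geometric_sum(r, n) == closed_form(r, n)
    for r in sample_ratios
    for n in range(1, 30)
)
print(agrees)                                  # prints: True
print(closed_form(Fraction(1, 2), 3))          # prints: 15/8
```

The excluded value $r = 1$ would make the denominator $1 - r$ vanish, which is why the theorem sets it aside.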


Since the statement to be proved may involve several variables, it is advisable to begin
a proof by mathematical induction by stating explicitly to which variable the process
is applied. In any such proof, the verification that $1 \in S$ is referred to as anchoring the
induction, the assumption that $k \in S$ is called the induction hypothesis, and the part of the
proof that uses the induction hypothesis to prove that $k + 1 \in S$ is the induction step.

In all proofs by mathematical induction, the set $S$ consists of the set of all the integers
for which a certain statement is true, and it is customary to leave the set $S$ implicit and to
speak simply of the set of integers for which the statement holds. With this convention
in mind, the PMI can be restated in the following manner:


Principle of Mathematical Induction, Version 2 Suppose that a statement about positive
integers possesses the two properties that the statement is true for 1 and if the
statement holds for some integer $k \ge 1$, then it also holds for $k + 1$. Then the statement
in question holds for all the positive integers.


It is this new version that is employed in the next example. 


Theorem F.7 If $n$ is any positive integer, then
$$1 + 3 + 5 + \cdots + (2n - 1) = n^2. \qquad (F.8)$$


Proof. We proceed by induction on $n$. When $n$ is replaced by 1, Equation F.8 becomes
$1 = 1^2$, which is valid. Thus the induction process has been anchored.

Next suppose that Equation F.8 is valid for the integer $k$. In other words, the induction
hypothesis is
$$1 + 3 + 5 + \cdots + (2k - 1) = k^2. \qquad (F.9)$$
Then
$$\begin{aligned}
1 + 3 + 5 + \cdots + (2k - 1) + [2(k+1) - 1] &= k^2 + [2(k+1) - 1] \\
&= k^2 + 2k + 2 - 1 \\
&= k^2 + 2k + 1 \\
&= (k+1)^2,
\end{aligned}$$
and so Equation F.8 also holds for $k + 1$. Thus Equation F.8 possesses the second property.
It follows from version 2 of the PMI that Equation F.8 holds for all positive integers. ∎


The choice of 1 as the anchoring point for the PMI is merely a convention. Sometimes
it is necessary to choose a different starting point, in which case this principle assumes a
slightly different form.


Principle of Mathematical Induction, Version 3 Suppose that a statement about integers
possesses the following two properties: The statement is true for the integer $a$. If the
statement holds for some integer $k \ge a$, then it also holds for $k + 1$. Then the statement
in question holds for all integers that are greater than or equal to $a$.

Theorem F.10 If $n \ge 7$, then $(3/2)^n > 2n + 1$.


Proof. The statement of this proposition makes it clear that the induction should be
anchored at $n = 7$. For this value of $n$, the inequality is $(3/2)^7 = 17.08\ldots > 2 \cdot 7 + 1$, which
is true. Next, assume that the inequality is valid for some integer $k \ge 7$. In other words,
assume that $k$ is an integer such that $(3/2)^k > 2k + 1$ and $k \ge 7$. Then
$$\left(\frac{3}{2}\right)^{k+1} = \left(\frac{3}{2}\right)\left(\frac{3}{2}\right)^{k} > \frac{3}{2}(2k + 1) = 3k + \frac{3}{2} = 2k + k + \frac{3}{2} > 2k + 3 = 2(k+1) + 1.$$
Notice that in verifying these steps, it was necessary to make use of both the induction
hypothesis $(3/2)^k > 2k + 1$ and the assumption that $k \ge 7$. By version 3 of the PMI, the
inequality holds for all $n \ge 7$. ∎
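A short computation shows why 7 is the natural anchoring point. The Python sketch below is an illustration of ours (the function name is an assumption); it uses exact fractions to list the small values of $n$ for which the inequality fails and to confirm it over a range of larger values.

```python
from fractions import Fraction


def inequality_holds(n: int) -> bool:
    """Test (3/2)^n > 2n + 1 exactly, without floating-point rounding."""
    return Fraction(3, 2) ** n > 2 * n + 1


failures = [n for n in range(1, 12) if not inequality_holds(n)]
print(failures)                                            # prints: [1, 2, 3, 4, 5, 6]
print(all(inequality_holds(n) for n in range(7, 200)))     # prints: True
```

The inequality is false for every positive integer below 7, so the induction cannot be anchored any earlier.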




The above examples are all algebraic in nature, but the following examples come from 


calculus and geometry. 


Theorem F.11 If $f(x)$ is any differentiable function of $x$, let $f'(x)$ denote its derivative
with respect to $x$. Using that $(x)' = 1$ and the product rule,
$$[u(x)v(x)]' = u'(x)v(x) + u(x)v'(x),$$
it follows that $(x^n)' = nx^{n-1}$ for every positive integer $n$.


Proof. We proceed by induction on $n$. When $n = 1$, $(x^1)' = 1$ is given, and so the
induction is anchored at 1. Assume next that $(x^n)' = nx^{n-1}$ is true for $n = k$. Then
$$(x^{k+1})' = [x(x^k)]' = (x)'(x^k) + x(x^k)' = 1 \cdot (x^k) + x(kx^{k-1}) = x^k + kx^k = (k+1)x^k.$$
By the PMI, $(x^n)' = nx^{n-1}$ for all positive integers $n$. ∎


Theorem F.12 In a plane, any $n$ straight lines, no two of which are parallel and no
three of which are concurrent, divide the plane into $(n^2 + n + 2)/2$ regions.


Proof. We proceed by induction on $n$. When $n = 1$, $(n^2 + n + 2)/2 = 2$, and it is clear
that any one line divides the plane into two regions. Thus the induction is anchored at
$n = 1$.

Assume that this proposition holds for $n = k$, and suppose that we are given $k + 1$
straight lines in a plane, no two of the lines being parallel and no three of them concurrent.
Let one of these lines be labeled as $q$, and suppose that it is temporarily deleted. By the
induction hypothesis, the remaining $k$ lines divide the plane into $(k^2 + k + 2)/2$ regions.
Restore the line $q$ to its old position. Since $q$ is not parallel to any of the other lines,
and since it does not pass through any of their intersections, it follows that those lines
cut $q$ into $k + 1$ sections (two of which happen to be infinite). Each of these sections
divides one of the old regions into two new regions. This means that the restoration of
$q$ raises the region count by $2(k + 1) - (k + 1) = k + 1$. Consequently, the number of
regions determined by the given $k + 1$ lines is
$$\begin{aligned}
\frac{k^2 + k + 2}{2} + (k + 1) &= \frac{k^2 + k + 2 + 2k + 2}{2} \\
&= \frac{(k^2 + 2k + 1) + (k + 1) + 2}{2} \\
&= \frac{(k+1)^2 + (k+1) + 2}{2}.
\end{aligned}$$
Thus, by the PMI, we are done. ∎
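The counting argument of this proof translates directly into a recurrence: one line gives two regions, and restoring the $(k+1)$-th line adds $k + 1$ regions. The Python sketch below is an illustration of ours (the function name is hypothetical); it builds the count in exactly this way and compares it with the formula of Theorem F.12. No geometry is computed; only the bookkeeping of the proof is reproduced.

```python
def regions(n: int) -> int:
    """Region count for n lines in general position, built up as in the proof."""
    count = 2                      # a single line divides the plane into two regions
    for k in range(1, n):
        count += k + 1             # the (k+1)-th line is cut into k + 1 sections
    return count


print([regions(n) for n in range(1, 7)])   # prints: [2, 4, 7, 11, 16, 22]
print(all(regions(n) == (n * n + n + 2) // 2 for n in range(1, 100)))   # prints: True
```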


Sometimes, information about $k$ does not transfer to information about $k + 1$. For
example, the prime factorization of $k$ sheds no light whatsoever upon the prime factorization
of $k + 1$. For such cases, we have yet another form of mathematical induction.


Principle of Mathematical Induction, Version 4 Suppose that a statement about integers
possesses the following two properties: The statement is true for an integer $a$. For
any integer $k > a$, if the statement is true for $a, a+1, \ldots, k-1$, then it is also true for $k$.
Then the statement in question holds for all integers that are greater than or equal to $a$.


Theorem F.13 Every positive integer $n \ge 2$ can be expressed as the product of primes.


Proof. We proceed by induction on $n$. Since 2 is a prime, $2 = 2$ anchors the induction at
$n = 2$. Next, let $k > 2$ be a positive integer such that each of the integers $2, 3, \ldots, k-1$ can
be factored into a product of primes. If $k$ is prime, then $k = k$ is the required expression.
If $k$ is not a prime, then $k = ab$ for some integers $a, b \ge 2$. In that case we also have
$a, b \le k - 1$, so, by the induction hypothesis, both $a$ and $b$ are expressible as products
of primes: $a = p_1 p_2 \cdots p_s$ and $b = q_1 q_2 \cdots q_t$. Then $k = ab = p_1 p_2 \cdots p_s q_1 q_2 \cdots q_t$ is
the required expression of $k$ as a product of primes. Hence, by version 4 of the PMI, each
of the integers $\ge 2$ is expressible as the product of primes. ∎
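The structure of this proof is that of a recursive procedure: a prime is its own factorization, and a composite $k = ab$ is handled by factoring the smaller integers $a$ and $b$. The Python sketch below is our own illustration (function names are assumptions), following that pattern; it is not meant as an efficient factoring method.

```python
from math import isqrt, prod


def prime_factors(k: int) -> list:
    """Return a list of primes whose product is k, for any integer k >= 2."""
    if k < 2:
        raise ValueError("k must be at least 2")
    for a in range(2, isqrt(k) + 1):
        if k % a == 0:
            # k = a * b with 2 <= a, b <= k - 1: factor the smaller integers,
            # just as the induction hypothesis is applied to a and b.
            return prime_factors(a) + prime_factors(k // a)
    return [k]                      # no proper divisor: k itself is prime


print(prime_factors(360))           # prints: [2, 2, 2, 3, 3, 5]
print(all(prod(prime_factors(k)) == k for k in range(2, 2000)))   # prints: True
```

The recursion terminates because each call is made on a strictly smaller integer, which is the role played by version 4 of the principle.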


Principle of Mathematical Induction, Version 5 Every nonempty set of positive integers
has a least element.

Proof. We assume that version 4 of the PMI holds and suppose, by way of contradiction,
that S is a nonempty set of positive integers that does not have a least element. For each
positive integer n let P(n) be the statement n ∉ S. Then P(1) is true because otherwise 1 ∈ S
and hence 1 would be a minimum of S, contradicting the assumption that S has no
minimum. Suppose P(j) is true for 1 ≤ j ≤ k, or, in other words, for all j ∈ {1, 2, ..., k},
j ∉ S. Now if k + 1 ∈ S it follows that k + 1 would then be the minimal element of S. Hence
k + 1 ∉ S and it follows that P(k + 1) holds. Thus we have proved that

P(1) holds; and

if P(j) holds for all j ∈ {1, 2, ..., k}, then P(k + 1) holds.

Since we are assuming version 4 of the PMI, it follows that P(n) is true for all positive
integers n. This, however, implies that S is the empty set, contradicting the hypothesis
that it is nonempty. Consequently S must have a minimal element.


Exercises F.1

Each of the examples in Exercises F.1.1 to F.1.17 is to be proved by mathematical induction
on the integer n.

1. $1^2 + 2^2 + 3^2 + \cdots + n^2 = n(n+1)(2n+1)/6$.

2. $1^3 + 2^3 + 3^3 + \cdots + n^3 = n^2(n+1)^2/4$.

3. $1\cdot 2 + 2\cdot 3 + 3\cdot 4 + \cdots + n(n+1) = n(n+1)(n+2)/3$.

4. $\frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \frac{1}{3\cdot 4} + \cdots + \frac{1}{n(n+1)} = \frac{n}{n+1}$.

5. $1\cdot 2\cdot 3 + 2\cdot 3\cdot 4 + 3\cdot 4\cdot 5 + \cdots + n(n+1)(n+2) = n(n+1)(n+2)(n+3)/4$.

6. $n^2 > 5n + 17$ for $n \ge 8$.

7. $2^n > n^2$ for $n \ge 5$.

8. $3^n > n^3$ for $n \ge 4$.

9. $(4/3)^n > 1 + n$ for $n \ge 8$.

10. $\lim_{x\to\infty} x^n e^{-x} = 0$ for $n \ge 0$.

11. $\int_0^\infty x^n e^{-x}\,dx = n!$ for $n \ge 0$.

12. $n^3 + (n+1)^3 + (n+2)^3$ is divisible by 9 for all $n \ge 1$.

13. $\cos a \cos 2a \cos 4a \cdots \cos 2^n a = \sin(2^{n+1}a)/(2^{n+1}\sin a)$ for $n \ge 0$.

14. The integer $11^{n+2} + 12^{2n+1}$ is divisible by 133 for all $n \ge 0$.

15. Given n planes, no two of which are parallel and no three of which contain the
same straight line, they divide space into $(n^3 + 5n + 6)/6$ portions.

16. Every integer of the form 4n + 3 with n ≥ 0 has a prime divisor of the same form.


Figure F.1 A geometrical division method


17. Let ABCD be a trapezoid in which the nonparallel sides AB and CD intersect in
a point M (see Figure F.1). Define $A_1 = A$, $B_1 = B$, and for each positive integer
k, let $B_{k+1}$ be the intersection of the lines $CA_k$ and BD, and let $A_{k+1}$ be the
intersection of the lines $MB_{k+1}$ and AD. Then $DA_n = DA/n$ for all $n \ge 1$.


18. Let 4)=4,=1anda,,,=4a,,,+5a, for 2 >0. Prove that a, < 3” forall n> 3”. 


G. Logic, Predicates, Sets, and Functions 


Propositions are assertions that may be either true or false. If p is a proposition, we write 
A(p) =T or A(p) =F according as p is true or false. 

Propositions can be compounded and propositional calculus is concerned with the
questions that arise from this compounding. Given two propositions p and q, their
conjunction is the compound proposition p and q, denoted by p ∧ q. For example, if
p is the proposition −1 < 1 and q is the proposition −3 < −1 then we write p ∧ q as
either −1 < 1 and −3 < −1 or, more briefly, as −3 < −1 < 1. The disjunction of p and
q, denoted by p ∨ q, is the proposition either p or q. Thus, for the same p and q as
above, p ∨ q is either −1 < 1 or −3 < −1. Similarly, if p is 10 is divisible by 2 and q
is 10 is divisible by 3, then p ∧ q is 10 is divisible by both 2 and 3, and p ∨ q is 10 is
divisible by either 2 or 3 or both.


G.1 Truth Tables 


We stipulate that the logical value of a compound proposition depends only on the logical
values of its components. Thus, the conjunction p ∧ q is true if and only if both p and q
are true. On the other hand, p ∨ q is true if and only if at least one of p and q is true.
This is summarized in Table G.1. Such tables are called truth tables. Two compound
propositions are said to be equivalent if they have identical columns in the appropriate
truth table. This relationship is denoted by the ≡ symbol. For example, Table G.2
demonstrates that p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r).
The symbol ¬p denotes the negation of p. It has the truth table in Table G.3.


Introductory Modern Algebra, Second Edition. 413 
By Saul Stahl Copyright © 2013 John Wiley & Sons, Inc. 


p    q    p ∧ q    p ∨ q
F    F      F        F
F    T      F        T
T    F      F        T
T    T      T        T


Table G.1 Truth tables of conjunction and disjunction


p    q    r    q ∨ r    p ∧ (q ∨ r)    p ∧ q    p ∧ r    (p ∧ q) ∨ (p ∧ r)
F    F    F      F           F            F        F              F
F    F    T      T           F            F        F              F
F    T    F      T           F            F        F              F
F    T    T      T           F            F        F              F
T    F    F      F           F            F        F              F
T    F    T      T           T            F        T              T
T    T    F      T           T            T        F              T
T    T    T      T           T            T        T              T


Table G.2 The proof of an equivalence


p    ¬p
F    T
T    F


Table G.3 The truth table of negation
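Checking an equivalence by truth table is a finite, mechanical task, so it can be sketched in a few lines. The helper name `equivalent` is an assumption made for this illustration; Python's `and`, `or` stand in for ∧, ∨.

```python
from itertools import product


def equivalent(f, g, n):
    """True when f and g agree on every assignment of n truth values."""
    return all(f(*row) == g(*row) for row in product([False, True], repeat=n))


# Distributive law of Table G.2: p ∧ (q ∨ r) versus (p ∧ q) ∨ (p ∧ r).
lhs = lambda p, q, r: p and (q or r)
rhs = lambda p, q, r: (p and q) or (p and r)
distributive_holds = equivalent(lhs, rhs, 3)

# A non-example: conjunction and disjunction differ on some rows.
and_equals_or = equivalent(lambda p, q: p and q, lambda p, q: p or q, 2)
```

Each call to `equivalent` enumerates the 2^n rows of the table and compares the two result columns, which is exactly the criterion for equivalence stated above.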


p    q    p ⇒ q    ¬p    ¬p ∨ q
F    F      T       T       T
F    T      T       T       T
T    F      F       F       F
T    T      T       F       T


Table G.4 The truth table of implication


Exercises G.1 


Prove the equivalences in Exercises G.1.1 to G.1.8 by the construction of appropriate
tables.

1. p ∨ q ≡ q ∨ p
2. p ∧ q ≡ q ∧ p
3. (p ∨ q) ∨ r ≡ p ∨ (q ∨ r)
4. (p ∧ q) ∧ r ≡ p ∧ (q ∧ r)
5. (p ∨ q) ∧ r ≡ (p ∧ r) ∨ (q ∧ r)
6. (p ∧ q) ∨ r ≡ (p ∨ r) ∧ (q ∨ r)
7. ¬(p ∨ q) ≡ ¬p ∧ ¬q
8. ¬(p ∧ q) ≡ ¬p ∨ ¬q


G.2 Modeling Implication 


It is necessary to define a propositional operator that models implication. This is, after all, a
mathematics textbook and implications and consequences are the business of mathematics.
If p and q are any two propositions, then we denote the notion that “p implies q” by
p ⇒ q. We now argue that the nature of implication is described by the third column
of Table G.4. The first two rows together assert that a false proposition implies any
proposition whatsoever. In other words, a false hypothesis can be used to draw any
conclusion whatsoever. This may seem somewhat too sweeping a statement, but consider
the following well-known anecdote:

Lord Bertrand Russell, one of the pioneers of mathematical logic, was delivering a talk
on this subject, when he was interrupted by someone in the audience who shouted to
him “One equals two: prove you are the Pope!” Replied Russell: “The Pope and I are a
set of two ....” To complete Russell’s reasoning, if you assume that one equals two, then
the set consisting of him and the Pope is a one-element set from which it follows that the
Pope and Russell are one and the same. Thus, a false statement can imply a completely
unrelated false statement.

While it would be all right for a false statement to imply a false one (first row), can
a false statement imply a true one (second row)? The answer is yes, as shown by the
following argument wherein an obviously false statement leads to an equally obvious true
statement:


−1 = 1 ⇒ (−1)² = (1)² ⇒ 1 = 1.


Consequently the first two entries in the third column of Table G.4, the ones that
correspond to A(p) = F, A(q) = F and A(p) = F, A(q) = T, must be T.

On the other hand, the derivation of a false statement from a valid one can never be
tolerated under any circumstances whatsoever. It is inherent in human reasoning that
any derivation of a false statement, such as −1 = 1, from valid hypotheses must contain
an error. Hence the entry that corresponds to A(p) = T, A(q) = F must itself be F.
This leaves the row corresponding to A(p) = T, A(q) = T. The entry must be the same
regardless of the instances of p and q, as long as they are both true. The entry under
p ⇒ q is clearly T when p and q are identical propositions and so it must be T whenever
p and q are both true.

The theorems, propositions, lemmas, and whatnots of mathematics are implications,
i.e., have the form p ⇒ q. When working with a complicated statement of this type it is
often useful to transform it into a simpler statement that is logically equivalent to it. Rigor,
however, must be preserved.

The converse of the statement p ⇒ q is the implication q ⇒ p. For example, if p is
x = 1 and q is x² = 1, then it is clear that p ⇒ q. The converse of this implication is
q ⇒ p, or if x² = 1 then x = 1, which is clearly false. Hence the converse implication
q ⇒ p is not necessarily valid. The implication ¬p ⇒ ¬q is the inverse of p ⇒ q. It
is also not logically equivalent to p ⇒ q since it asserts that if x ≠ 1 then x² ≠ 1. This is
evidently invalid, as demonstrated by using x = −1.

Thus, so far each time we altered the implication its logical value was damaged as
well. However, we do hit the jackpot with the contrapositive, namely the
statement ¬q ⇒ ¬p. If we proceed, as before, with p: x = 1 and q: x² = 1,
then this contrapositive assumes the form x² ≠ 1 ⇒ x ≠ 1. We are now faced with the
question of the validity of this implication. Contrary to the cases of the converse and
the inverse, checking with x = −1 doesn’t lead to trouble and so the possibility that the
contrapositive is valid is still open. We could try other values for x but the results would
be the same. So, there is hope that the contrapositive is logically equivalent to the original
statement. In fact this can be easily proved by constructing the appropriate truth table.


Figure G.1 Euclid’s Axiom


                 | identity        converse        inverse         contrapositive
identity         | identity        converse        inverse         contrapositive
converse         | converse        identity        contrapositive  inverse
inverse          | inverse         contrapositive  identity        converse
contrapositive   | contrapositive  inverse         converse        identity


Table G.5 Interrelationships among contrapositive, converse, and inverse

The reader might well wonder at all the brouhaha about the contrapositive. After all, all
we are saying is that two implications are logically equivalent. However, psychologically
speaking one of the implications may be more pliant or easy than the other and hence
preferable. For example the implication x² ≠ 1 ⇒ x ≠ 1 is clearly more complicated
and opaque than its contrapositive x = 1 ⇒ x² = 1. Another example is provided by
Euclid’s Axiom, which states that, in Figure G.1, m ∥ n ⇒ α = β; its contrapositive is
α ≠ β ⇒ m ∩ n ≠ ∅.

The notions of contrapositive, converse, and inverse have interesting interrelationships.
To begin with, the successive application of two distinct notions results in the third one.
For example, the converse of the inverse of p ⇒ q is the converse of ¬p ⇒ ¬q which is
¬q ⇒ ¬p, which is the contrapositive of p ⇒ q.

In addition, it is clear that the combined effect of the application of two identical
notions results in no change. These observations are summarized in the diagram of
Table G.5. The reader may notice that this table is isomorphic to the multiplication table
of the Klein 4-group.
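The interrelationships above can be reproduced mechanically. In the sketch below an implication is modelled as a pair of literals (hypothesis, conclusion), where a literal is a pair (name, negated); this representation and all function names are assumptions made for the illustration.

```python
from itertools import product


def negate(lit):
    name, negated = lit
    return (name, not negated)


def identity(imp):
    return imp


def converse(imp):
    hyp, concl = imp
    return (concl, hyp)


def inverse(imp):
    hyp, concl = imp
    return (negate(hyp), negate(concl))


def contrapositive(imp):
    return converse(inverse(imp))


OPS = {"identity": identity, "converse": converse,
       "inverse": inverse, "contrapositive": contrapositive}

base = (("p", False), ("q", False))  # the implication p => q


def compose_name(first, second):
    """Name of the notion obtained by applying `first`, then `second`."""
    result = OPS[second](OPS[first](base))
    return next(name for name, op in OPS.items() if op(base) == result)


# The composition table; it should match the table of the four notions.
table = {(a, b): compose_name(a, b) for a in OPS for b in OPS}


def value(imp, env):
    """Truth value of an implication under the assignment env."""
    (h, h_neg), (c, c_neg) = imp
    hyp = env[h] != h_neg
    concl = env[c] != c_neg
    return (not hyp) or concl


def same_truth_table(i1, i2):
    return all(value(i1, {"p": p, "q": q}) == value(i2, {"p": p, "q": q})
               for p, q in product([False, True], repeat=2))
```

Here `same_truth_table(base, contrapositive(base))` confirms the logical equivalence of an implication and its contrapositive, while the same check against the converse or the inverse fails.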


Exercises G.2


1. Prove that p ⇒ q ≡ ¬q ⇒ ¬p.
2. Prove that p ⇒ q ≢ q ⇒ p.
3. Prove that p ⇒ q ≢ ¬p ⇒ ¬q.


G.3 Predicates and Their Negation


We now set out to explain the logic that underlies the intricate use of negation with which
the proof of Proposition 11.5 begins. To begin with, we assume that the universe, or
domain of possible values for each of our variables, is R. Let φ(x) be a statement about
the variable x that becomes either true or false when the variable x is replaced by a specific
number. For example φ(x) can be any of the following: φ₁(x): x < 0, φ₂(x): x² + 1 = 0,
or φ₃(x): x² − 1 = (x − 1)(x + 1), where “φ(x) : …” means that φ(x) is the statement “…”.

Note that φ₁(x) is true for some x’s and false for others. On the other hand φ₂(x) is
false for all x’s, whereas φ₃(x) is true for all x’s. Let ¬φ(x) denote the negation of φ(x).
Thus, ¬φ₁(x): x ≥ 0, ¬φ₂(x): x² + 1 ≠ 0, and ¬φ₃(x): x² − 1 ≠ (x − 1)(x + 1).

In general, the statement φ(x) may be neither true nor false. As noted above, one
way to turn it into a proposition that may be true or false is to substitute a real number
for x. Thus, φ₁(−1): −1 < 0 is true; φ₁(1): 1 < 0 is false; φ₂(1): 1² + 1 = 0 is false; and
φ₃(−1): (−1)² − 1 = (−1 − 1)(−1 + 1) is true.

Another way to turn the neutral statement φ(x) into a proposition that is either true
or false is via the application of quantifiers. Let ∃x denote the phrase “there exists an x
such that” and let ∀x denote the phrase “for all x.” Then

∃x φ₁(x): there exists a number x such that x < 0,

which is true, and

∀x φ₁(x): every number x is negative,

which is false. The quantifier ∃ is known as the existential quantifier since it asserts the
existence of some object in the universe. The quantifier ∀ is the universal quantifier
because it asserts that something is true for all the objects. The negation of quantified
propositions follows the rules ¬∃x φ(x) ≡ ∀x ¬φ(x) and ¬∀x φ(x) ≡ ∃x ¬φ(x).

Thus, the negation of the proposition “all numbers are negative,” which is denoted
by ∀x φ₁(x), is the proposition “there is a number which is not negative”, denoted
by ∃x ¬φ₁(x). Similarly, the negation of the proposition “there is a negative number”,
denoted by ¬∃x φ₁(x), is the proposition “all numbers are nonnegative,” denoted by
∀x ¬φ₁(x).

These considerations become nonobvious when the number of variables is raised.
Suppose φ(x, y): x > y. We turn this into a proposition by quantifying both of the
variables. Say, for starters, that the universal quantifier is applied to both x and y, so
that we have the proposition ∀x ∀y φ(x, y) whose negation, by two applications of the
rules above, is

¬∀x ∀y φ(x, y) ≡ ∃x ¬∀y φ(x, y) ≡ ∃x ∃y ¬φ(x, y)

where ¬φ(x, y) means x ≤ y. Somewhat less formally, the equivalence above says that the
negation of the proposition x > y for all x and y is x ≤ y for some x and some y.

Next let us apply mixed quantifiers to φ(x, y), the universal one first (to y), so that we
get the proposition ∃x ∀y φ(x, y), which is “there exists a number x that is greater than
every number y,” which is clearly false. The negation of this proposition is

¬∃x ∀y φ(x, y) ≡ ∀x ¬∀y φ(x, y) ≡ ∀x ∃y ¬φ(x, y),

or “for every number x there exists a number y such that x ≤ y.” This, of course, is a
true proposition.

Similarly the proposition ∀x ∃y φ(x, y) says “for every x there exists a y such that
x > y,” which is true. The negation of this proposition is

¬∀x ∃y φ(x, y) ≡ ∃x ¬∃y φ(x, y) ≡ ∃x ∀y ¬φ(x, y),

which says that “there exists an x such that for all y, x ≤ y” and which is clearly false.
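The negation rules can be tried out on a small finite universe, where ∀ corresponds to Python's `all` and ∃ to `any`. This is only a sketch: the finite range below is an assumed stand-in for R, so the truth values of individual propositions may differ from the real-number case (a finite set has a least element), but each proposition and its rewritten negation must still agree.

```python
universe = range(-3, 4)  # assumption: a finite stand-in for the real numbers


def phi(x, y):
    return x > y


# not (forall x forall y phi)  versus  exists x exists y (not phi)
lhs1 = not all(all(phi(x, y) for y in universe) for x in universe)
rhs1 = any(any(not phi(x, y) for y in universe) for x in universe)

# not (exists x forall y phi)  versus  forall x exists y (not phi)
lhs2 = not any(all(phi(x, y) for y in universe) for x in universe)
rhs2 = all(any(not phi(x, y) for y in universe) for x in universe)

# not (forall x exists y phi)  versus  exists x forall y (not phi)
lhs3 = not all(any(phi(x, y) for y in universe) for x in universe)
rhs3 = any(all(not phi(x, y) for y in universe) for x in universe)
```

In each pair the negation sign has been pushed inward one quantifier at a time, flipping ∀ to ∃ and ∃ to ∀, just as in the displayed equivalences.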


G.4 Two Applications 


Let x, a, and L be fixed real numbers and let ε and δ be variables each of whose universes
consists of the positive real numbers. Set p: |x − a| < δ and q: |f(x) − L| < ε. Then
the proposition “f(x) converges to L in the ε-δ sense” is represented by ∀ε ∃δ (p ⇒ q).
Hence when we assume that f(x) fails to converge to L we are assuming the negation
¬∀ε ∃δ (p ⇒ q), which is equivalent to

∃ε ∀δ ¬(p ⇒ q) ≡ ∃ε ∀δ ¬(¬p ∨ q) ≡ ∃ε ∀δ (p ∧ ¬q).

This, however, translates to “there exists a positive ε such that for all positive δ, |x − a| < δ
and |f(x) − L| ≥ ε.”

Throughout the discussion above it was assumed that the universe of each of the
variables x and y is R. This, of course, need not be the case. Of particular interest is
the case where the universe of y is N, that is, the natural numbers 1, 2, 3, ..., whereas ε
can be any positive real number. For example, let $\{x_n\}$ be a fixed sequence and define
$T_m$ to be the subsequence $T_m = x_{m+1}, x_{m+2}, x_{m+3}, \ldots$. Consider the statement
$\{x_n\} \to A$ whose definition is “every neighborhood of A contains all but a finite number of
the $x_n$’s,” or, equivalently, “for every ε > 0 there exists a natural number m such that
$T_m \subset (A-\varepsilon, A+\varepsilon)$.” Setting φ(ε, m): $T_m \subset (A-\varepsilon, A+\varepsilon)$ we obtain ∀ε ∃m φ(ε, m) as
the formalization of the convergence $\{x_n\} \to A$. The negation therefore is

¬∀ε ∃m φ(ε, m) ≡ ∃ε ¬∃m φ(ε, m) ≡ ∃ε ∀m ¬φ(ε, m),

i.e., “there exists an ε such that for all m, $T_m \not\subset (A-\varepsilon, A+\varepsilon)$.”
Exercises G.4 


In Exercises G.4.1 to G.4.4, interpret the propositions ∀x ∃y φ(x, y), ∃x ∀y φ(x, y),
∃x ∃y φ(x, y), and ∀x ∀y φ(x, y) for the given specification of φ(x, y). Decide whether
the resulting propositions are true or false.


1. x < y      2. x = y + 1      3. x = y² + 1      4. x + y² < 1


Use the results of this section to negate the following propositions. 
5. Every person has a progenitor. 
6. Every person has a biological child. 
7. Every real number has a square root.
8. Every real number has a cubic root. 
9. Every rational number has a rational square root. 
10. Some real numbers have no real square roots. 
11. Some rational numbers have no real square roots. 


12. All the integer divisors of 12 are prime.
13. None of the people in this room are friendless.
14. Every person in this room has a friend.
15. There exist two integers whose sum is 5 and whose product is 8.
16. The sum of any two integers is smaller than their product.
17. Any two cosets are either equal or disjoint.


G.5 Sets


A set is a collection of distinct elements. Two sets are equal when they have exactly the same
elements, regardless of the orders in which their elements are listed. Sets can be denoted in
a variety of ways. For example, the set consisting of the integers 2, 3, and 4 is denoted by
{2, 3, 4}. The set consisting of all the squares of positive integers is {1, 4, 9, ..., n², ...}
or {n² | n = 1, 2, 3, ...}. The elements of a set have to be distinct from each other and if
some repetitions occur they must be dropped. Thus {n² | n = −2, −1, 0, 1, 2} = {0, 1, 4}.

If the set A contains an element a, we can also say that a belongs to A and we write
a ∈ A. If a does not belong to A, this is denoted by a ∉ A. The empty set is the unique
set which contains no elements. It is usually denoted by the symbol ∅. The meaning of
the universal set, or the universe, depends on the context. If we are working with integers
the universal set is Z. On the other hand, if we are dealing with irrational numbers, the
universal set may be either R or C.

If A and B are sets, then A − B = {a ∈ A | a ∉ B}. If A is any set, and the universal
set is U, then the complement of A (in U) is Aᶜ = {a ∈ U | a ∉ A} = U − A. The union
of the two sets A and B is A ∪ B = {c | c ∈ A or c ∈ B (or both)} and the intersection of
A and B is A ∩ B = {c | c ∈ A and c ∈ B}. If every element of A is also an element of
B, we write A ⊂ B and say that A is a subset of B and that B is a superset of A.


The equivalences below connect set theory to the propositional calculus:

[x ∈ (A ∪ B)] ≡ (x ∈ A) ∨ (x ∈ B),

[x ∈ (A ∩ B)] ≡ (x ∈ A) ∧ (x ∈ B),

[x ∈ (A − B)] ≡ (x ∈ A) ∧ (x ∉ B).

They allow us to derive set theoretic proofs from logical ones. As an example we prove
the following theorem.


Theorem G.1 A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).


Proof. For any x in our implicit universe

x ∈ A ∩ (B ∪ C) ≡ (x ∈ A) ∧ (x ∈ B ∨ x ∈ C)
                ≡ (x ∈ A ∧ x ∈ B) ∨ (x ∈ A ∧ x ∈ C)
                ≡ (x ∈ A ∩ B) ∨ (x ∈ A ∩ C)
                ≡ x ∈ (A ∩ B) ∪ (A ∩ C)

and so, by the transitivity of equivalence, A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
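As a sanity check (not a substitute for the proof), the identity of Theorem G.1 can be tested exhaustively on all subsets of a small universe. The universe {1, 2, 3, 4} and the helper `subsets` are assumptions made for this sketch; Python's `&` and `|` are set intersection and union.

```python
from itertools import combinations

universe = {1, 2, 3, 4}  # assumed small universe for the check


def subsets(s):
    """Return every subset of s as a Python set."""
    items = sorted(s)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


all_subsets = subsets(universe)

# Test A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) for every triple of subsets.
distributive = all(
    A & (B | C) == (A & B) | (A & C)
    for A in all_subsets
    for B in all_subsets
    for C in all_subsets
)
```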


Exercises G.5 


Let A, B, C be subsets of the set S. In Exercises G.5.1 to G.5.9, prove the given identity
using the method of your choice.

1. A ∪ B = B ∪ A
2. A ∩ B = B ∩ A
3. A ∪ (B ∪ C) = (A ∪ B) ∪ C
4. A ∩ (B ∩ C) = (A ∩ B) ∩ C
5. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
6. (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ
7. (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
8. A − (B − C) = (A − B) ∪ (A ∩ C)
9. A − (B ∪ C) = (A − B) − C
10. Let Δ be defined by A Δ B = (A − B) ∪ (B − A). Prove that A Δ (B Δ C) = (A Δ B) Δ C.


G.6 Functions 


Given two sets A and B, a function f : A → B is a rule that assigns to every element
a ∈ A a unique corresponding element f(a) ∈ B. The domain of this function f is A
and its range is B. The codomain of f is the set of all the elements b ∈ B for which there
exists an a ∈ A such that f(a) = b. The codomain of f is denoted by f(A). The two
functions f, g : A → B are said to be equal provided f(x) = g(x) for all x ∈ A. It is
useful to represent functions f : A → B by means of a diagram such as Figure G.2.

For any set A we denote by I_A the function I_A : A → A such that I_A(a) = a for all
a ∈ A. For any sets A, B, C and functions f : A → B and g : B → C, we define the
composition g ∘ f : A → C as (g ∘ f)(a) = g(f(a)) for all a ∈ A.


Proposition G.2 The composition of functions is associative. That is, given functions
f : A → B, g : B → C, and h : C → D, then h ∘ (g ∘ f) = (h ∘ g) ∘ f.


Figure G.2 A diagram of a function


Proof. By several applications of the definition of composition it follows that for any
a ∈ A

(h ∘ (g ∘ f))(a) = h((g ∘ f)(a)) = h(g(f(a))) = (h ∘ g)(f(a)) = ((h ∘ g) ∘ f)(a).

The desired conclusion then follows from the definition of equality of functions.

Proposition G.3 Let f : A → B. Then I_B ∘ f = f = f ∘ I_A.


Proof. For any a ∈ A

(I_B ∘ f)(a) = I_B(f(a)) = f(a) = f(I_A(a)) = (f ∘ I_A)(a).

The definition of the equality of functions now clearly implies the proposition.
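Propositions G.2 and G.3 can be illustrated with ordinary Python functions. The sketch below assumes three sample functions on the integers and compares values on a finite sample only, since equality of functions on an infinite domain cannot be tested exhaustively; all names are illustrative.

```python
def compose(g, f):
    """Return g ∘ f, the function a -> g(f(a))."""
    return lambda a: g(f(a))


def identity(a):
    """The identity function I_A on any set A."""
    return a


# Assumed sample functions Z -> Z, chosen only for illustration.
f = lambda n: n + 1
g = lambda n: 2 * n
h = lambda n: n * n

sample = range(-5, 6)

left = compose(h, compose(g, f))    # h ∘ (g ∘ f)
right = compose(compose(h, g), f)   # (h ∘ g) ∘ f
associative_on_sample = all(left(a) == right(a) for a in sample)

identity_laws_on_sample = all(
    compose(identity, f)(a) == f(a) == compose(f, identity)(a) for a in sample
)
```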


The function f : A → B is said to be injective if it has the property that for any a, a′ ∈ A
if a ≠ a′ then f(a) ≠ f(a′), or, equivalently, if f(a) = f(a′) then a = a′. In terms of
the arrow presentation of functions, f is injective if no two distinct arrowtips meet. The
function f : Z → Z defined by f(m) = m³ is injective, as are $f_k(m) = m^{2k-1}$ for every
positive integer k. On the other hand, the function $f(m) = m^{2k}$ fails to be injective since,
for instance with k = 1,

f(−3) = 9 = f(3).


Proposition G.4 The function f : A → B is injective if and only if there exists a function
g : B → A such that g ∘ f = I_A.

Any function g that satisfies g ∘ f = I_A is said to be a left inverse of f. Thus,
Proposition G.4 can be rephrased as


Proposition G.5 The function f : A → B is injective if and only if it has a left inverse.


Proof. Assume first that f : A → B has a left inverse g. If a, b are in A and f(a) = f(b),
then

a = (g ∘ f)(a) = g(f(a)) = g(f(b)) = (g ∘ f)(b) = b

and hence f is injective.

Conversely, suppose f is injective. We define a function g : B → A as follows (see
Figure G.3):

\[
g(y) = \begin{cases} x & \text{if } f(x) = y; \\ \text{any element of } A & \text{otherwise.} \end{cases}
\]

This g is well defined because, f being injective, at most one x satisfies f(x) = y.
We now demonstrate that g is a left inverse of f. If a ∈ A, then, by the definition of g,
g(f(a)) = a and hence g is the required left inverse.


Figure G.3 A left inverse
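The construction in the second half of the proof can be carried out explicitly for finite sets. In this sketch a function is represented by a Python dict and the sets A, B are small assumed examples; the name `left_inverse` is illustrative, and the construction requires A to be nonempty so that "any element of A" exists.

```python
def left_inverse(f, A, B):
    """Given an injective f : A -> B (as a dict), return g : B -> A with g(f(a)) = a."""
    if not A:
        raise ValueError("A must be nonempty")
    if len(set(f[a] for a in A)) != len(A):
        raise ValueError("f is not injective")
    default = next(iter(A))          # plays the role of "any element of A"
    g = {b: default for b in B}      # the "otherwise" clause
    for a in A:
        g[f[a]] = a                  # g(y) = x whenever f(x) = y
    return g


# Assumed example data.
A = {1, 2, 3}
B = {"u", "v", "w", "x"}
f = {1: "u", 2: "w", 3: "x"}
g = left_inverse(f, A, B)
```

Here `g["v"]` is an arbitrary element of A, which shows that a left inverse is generally not unique.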


A function f : A → B is said to be surjective if it has the property that f(A) = B, or,
in other words, for each b ∈ B there exists an a ∈ A such that f(a) = b. In terms of the
arrow presentation, every element of B is the target of some arrow.


Proposition G.6 The function f : A → B is surjective if and only if there exists a
function h : B → A such that f ∘ h = I_B.


Any function h that satisfies f ∘ h = I_B is said to be a right inverse of f. Thus, the
proposition above can be rephrased as


Proposition G.7 The function f : A → B is surjective if and only if it has a right inverse.


Proof. Assume f : A → B is surjective. If y ∈ B there exists an x ∈ A such that f(x) = y
and we define h(y) to be any x* such that f(x*) = y. Then

(f ∘ h)(y) = f(h(y)) = f(x*) = y = I_B(y)

and hence f ∘ h = I_B.
Conversely, suppose f : A → B has the right inverse h : B → A, so that f ∘ h = I_B.
Then f(h(b)) = b so that f maps h(b) onto the arbitrary b (see Figure G.4).
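For finite sets the choice of "any x* such that f(x*) = y" can be made concrete by keeping the first preimage encountered. This is a sketch under that assumption; the dict representation, the example data, and the name `right_inverse` are illustrative.

```python
def right_inverse(f, A, B):
    """Given a surjective f : A -> B (as a dict), return h : B -> A with f(h(b)) = b."""
    h = {}
    for a in sorted(A):
        # Keep the first preimage found for each value; any other choice would also work.
        h.setdefault(f[a], a)
    if set(h) != set(B):
        raise ValueError("f is not surjective")
    return h


# Assumed example data.
A = {1, 2, 3, 4}
B = {"u", "v"}
f = {1: "u", 2: "v", 3: "u", 4: "v"}
h = right_inverse(f, A, B)
```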


A function is said to be bijective if it is both injective and surjective.


Proposition G.8 The function f : A → B is bijective if and only if there exists a
(necessarily unique) function g : B → A such that g ∘ f = I_A and f ∘ g = I_B.


Figure G.4 A right inverse


A bijective function σ : A → A is called a permutation.


Proposition G.9 Let A be a finite set and let f : A → A be a function. Then the following
are equivalent: f is bijective; f is injective; f is surjective.


Proof. See Exercise G.6.15.


A relation R on the set A is a rule that assigns some elements of A to some elements
of A. If the element b ∈ A is assigned to a, we write a R b. A relation R on the set A
is an equivalence relation if the following three properties hold for any three elements
a, b, c ∈ A:

* a R a (Reflexivity);

* a R b implies b R a (Symmetry);

* a R b and b R c imply a R c (Transitivity).

Let R be the relation on the points of the plane defined by a R b: a = b. Since for
all points a, b, and c, a = a, (a = b) ⇒ (b = a), and (a = b) ∧ (b = c) ⇒ (a = c), it
follows that this R (better known as equality of points) is an equivalence relation. Similarly,
equality of length is an equivalence relation on line segments in either the real line or the
plane.

The raison d’être of relations is to formalize and clarify the notion of a coset. Let G
be a group and H a subgroup of G. Define a R b provided $a^{-1}b \in H$. Now a R a
because $a^{-1}a = 1_G \in H$. It is clear that the following are equivalent: a R b; $a^{-1}b \in H$;
$(a^{-1}b)^{-1} \in H$; $b^{-1}a \in H$; b R a. Finally, if a R b and b R c, then $a^{-1}b \in H$ and
$b^{-1}c \in H$. Multiplication of these elements yields

$a^{-1}c = a^{-1}b\,b^{-1}c \in HH = H$

and so a R c.


Proposition G.10 Let R be an equivalence relation on the set A. Then R defines a
partition $\mathcal{A} = \{A_\lambda \mid \lambda \in \Lambda\}$ of A such that a R b if and only if a and b belong to the
same member of $\mathcal{A}$.

Proof. For each a ∈ A let [a] denote the set of all the elements b ∈ A such that a R b.
The reflexivity of R implies that for all a ∈ A, a ∈ [a]. The symmetry of R implies the
equivalence of the statements a ∈ [b] and b ∈ [a]. These sets have the property that for
each pair [a] and [b] of such sets, either [a] ∩ [b] = ∅ or [a] = [b], for, if [a] ∩ [b] ≠ ∅,
say, c ∈ [a] ∩ [b], then c ∈ [a] and c ∈ [b]. Consequently [a] = [c] = [b]. Conversely,
if a and b belong to the same member of $\mathcal{A}$, then [a] = [b] and hence a R b.
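The partition of Proposition G.10 can be computed for a finite set by comparing each element with one representative of every class found so far, which is legitimate precisely because R is an equivalence relation. The sketch below uses an assumed example: the additive group of integers modulo 12 with subgroup H = {0, 4, 8}, where the condition "a⁻¹b ∈ H" reads "(b − a) mod 12 ∈ H". The names are illustrative.

```python
def equivalence_classes(elements, related):
    """Partition `elements` into classes of the equivalence relation `related`."""
    classes = []
    for a in elements:
        for cls in classes:
            # Comparing with a single representative suffices for an equivalence relation.
            if related(cls[0], a):
                cls.append(a)
                break
        else:
            classes.append([a])  # a starts a new class [a]
    return classes


n = 12            # assumed example: Z_12 under addition
H = {0, 4, 8}     # a subgroup of Z_12
G = range(n)


def same_coset(a, b):
    """a R b when (-a) + b lies in H, the additive form of a^{-1} b in H."""
    return (b - a) % n in H


cosets = equivalence_classes(G, same_coset)
```

The resulting classes are the cosets of H, and every element of G lies in exactly one of them.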


Exercises G.6 


In Exercises G.6.1 to G.6.12, decide whether the relation R on the set A is an equivalence 


relation. If yes, describe the equivalence classes. If not, then decide which property is 


violated. 
1. A: people, R: brother of 7 Az={1,2,---,21}, Ri: a+ is odd 
2. A: people, R: sibling of 8. A={I1,2,---,21}, Ri:a+6 is odd 


3. Az={1,2,---,21},R:at+biseven 9. A={I1,2,---,21}, R: |Ja—6|=3 
4. A={1,2,---,21},R:at+biseven 10. A={I1,2,---,21},R: |at+6|=2 
§- Az={1,2,---,21}, R: a—6 is even 11. A=Z,R: x—y is divisible by 2 


6. A={1,2,---,21},R:a—biseven 12. A=Z,R: x—y is divisible by 3 


13. Prove that the composition of two injective functions is also injective. 
14. Prove that the composition of two surjective functions is also surjective. 


15. Prove Proposition G.9. 


Biographies 


Niels Henrik Abel (1802-1829). Despite his early death from consumption, the Norwegian
Abel exerted a major and lasting influence on the evolution of mathematics. He is
best known for his work on the quintic equation and elliptic integrals.


Muhammad ibn Musa al-Khwarezmi (ca. 780-850). Al-Khwarezmi was on the faculty
of the House of Wisdom, a scholarly institute in the city of Baghdad. He was the author of
several mathematical and astronomical works. His name eventually gave rise to the term
algorithm, and the first two syllables of the title of his text al-jabr wa'l-muqabala mutated
into the term algebra.


Archimedes (287-212 BCE). The Greek Archimedes was the greatest of the scientists, 
mathematicians, and engineers of antiquity. He lived in the city of Syracuse in Sicily. 
Among his accomplishments are the formulation of a precise theory of flotation, the 
computation of the volumes of many solids, including the sphere, and the construction 


of a variety of engines to defend his city against the besieging Romans. 


Rafael Bombelli (1526-1572). An Italian engineer by profession, Bombelli also wrote 
a widely read treatise called Algebra. This book contains the first known attempt to 


systematize complex numbers. 


Gerolamo Cardano (1501-1576). A physician by profession, the Italian Cardano wrote
the very influential text Ars Magna which included the solutions to the general cubic and
quartic equations.


Augustin-Louis Cauchy (1789-1857). The Frenchman Cauchy was the most prolific 
of all the mathematicians of the nineteenth century. While he worked in many areas 
of mathematics, he is best remembered today as the founder of the theory of complex 


variables and also for his contributions to the rigorization of calculus. 


Arthur Cayley (1821-1895). Cayley was the most productive English mathematician to 
follow Newton. He made many contributions to geometry, algebra, and analysis, but is 


best known for his work on invariant theory. 




Richard Dedekind (1831-1916). Dedekind was a German mathematician noted for his 
contributions to the rigorization of both analysis and algebra. This was facilitated in both 
cases by the fortuitous idea that missing numbers can be represented by sets of known 


numbers—Dedekind cuts in analysis and ideals in algebra. 


Gotthold Eisenstein (1823-1852). Like his contemporaries Galois and Abel, Eisenstein
died young. His death was at least partially due to his one-day arrest by the Prussian
army on suspicion of Republicanism. Nevertheless he was prolific (23 papers in 1844
alone) and Gauss thought very highly of his mathematical talents. He contributed much
to the development of the new mathematical field of algebraic number theory.


Euclid (3rd century BCE). Euclid was a Greek who lived in Alexandria. He wrote several
books of which the best known is The Elements. This textbook on geometry and number 


theory is arguably the most influential scientific tract of all time. 


Leonhard Euler (1707-1783). Together with the Frenchman Lagrange, the Swiss Euler 
completely dominated the mathematical developments of his time. He made fundamental 
contributions to all areas of mathematics, including some which did not begin to flourish 


until a century after his death. 


Pierre de Fermat (1601-1665). Known as the “Prince of Amateurs,” the Frenchman
Fermat was in fact a lawyer who regarded mathematics as a diversion. He did pioneering 


work in calculus and probability, and set number theory on a course it still follows. 


Lodovico Ferrari (1522-1565). A student of Cardano, Ferrari was the first to derive a 


formula for the solution of the general quartic equation. 


Scipione del Ferro (1465-1526). The Italian del Ferro was the first to succeed in solving 


cases of the cubic equation that had eluded both the Greek and Islamic mathematicians. 


Evariste Galois (1811-1832). The Frenchman Galois possessed one of the most original 
mathematical minds of all time. Despite his untimely death in a duel, Galois’s work on 
the solvability of equations eventually overshadowed that of his illustrious contemporaries 
Abel and Cauchy. He completely resolved the issue of solvability of equations and in the 


process created the modern mathematical discipline of group theory. 


Carl Friedrich Gauss (1777-1855). The German Gauss is generally considered to be the
greatest of the mathematicians to follow Newton. His profound work influenced the 
subsequent evolution of all the major areas of mathematics. In addition, he also made 


important contributions to astronomy, physics, and geodesy. 




Camille Jordan (1838—1922). A French mathematician who is best known for his 


pioneering work in group theory and linear algebra. 


Omar Khayyam (1048-1131). Better known for his collection of poems Rubaiyat, the 
Persian Khayyam also wrote several mathematical, astronomical, and philosophical tracts. 
He tried to systematize the solution of the cubic equation and actually solved several 


special cases. He is also known for his work on Euclid’s Parallel Postulate. 


Felix Klein (1849-1925). Klein was a highly influential German mathematician. His 
paper that came to be known as the Erlanger Programm focused the attention of math- 
ematicians on the applications of group theory to geometry. He was also one of the 


pioneers of hyperbolic geometry and the calculus of complex variables. 


Ernst E. Kummer (1810-1893). Kummer lived and worked in Germany all his life. He 
began his mathematical career as a high school teacher. His main mathematical work was 
in the area of algebra where he picked up where Gauss left off on the topic now known 
as algebraic number theory. He invented the term “ideal element,” which eventually led 


to the modern formulation in the language of rings and ideals. 


Joseph Louis Lagrange (1736-1813). Lagrange was born in Italy but spent most of his
adult life in France, from where his ancestors had come. He made vast contributions to
the fast evolving disciplines of calculus and differential equations. His work on algebraic
equations, linear algebra, and number theory also proved very fecund in the long run.


Adrien-Marie Legendre (1752-1833). Legendre was a highly influential French
mathematician. He made substantial contributions to geometry, analysis, and number theory,
in each of which areas he also wrote a definitive text.


Isaac Newton (1642-1727). An English scientist whose creativity influenced the
evolution of both mathematics and physics more than that of any other individual. Among
his many achievements are his theories of light and gravitation, and the development of
calculus. His best known book is the Principia Mathematica.


Blaise Pascal (1623-1662). Pascal was a French mathematician, philosopher, and scientist.
He invented the first adding machine and made major contributions toward the evolution
of geometry and the theory of probability.


Joseph Raphson (1648-1715). An Englishman whose tract Analysis Aequationum
Universalis described Newton’s numerical method for solving equations.




Paolo Ruffini (1765-1822). Ruffini was an Italian physician who published a proof of the
unsolvability of the general quintic equation by radicals. While his proof was incomplete,
it did contain some ideas that were eventually incorporated into Abel’s proof of the same
theorem.


Niccolò Tartaglia (1499-1557). An Italian mathematician whose work on some cubic
equations, together with that of his compatriots del Ferro and Cardano, resulted in the
complete solution of the cubic equation.


A. T. Vandermonde (1735-1796). A French mathematician whose pioneering work on
the algebraic solution of equations was eclipsed by that of Lagrange.


Bibliography 


Abel, N. H., Oeuvres complètes de Niels Henrik Abel, Christiania, Grøndahl, 1881.
Bell, E. T., Men of Mathematics, Simon and Schuster, New York, 1965. 


Birkeland, B., Ludvig Sylow’s Lectures on Algebraic Equations and Substitutions, Christiania (Oslo),
1862: An Introduction and a Summary, Historia Mathematica, 23 (1996), 182-199.

Birkhoff, G., and Mac Lane, S., A Survey of Modern Algebra, 4th ed., Macmillan, New York, 1977. 
Bombelli, R., Algebra, Feltrinelli, Milan, 1966. 

Borofsky, S., Elementary Theory of Equations, Macmillan, New York, 1959. 

Cajori, F., An Introduction to the Theory of Equations, Macmillan, New York, 1969. 

Cardano, G., Ars Magna, Dover, New York, 1993.

Cauchy, A. L., Oeuvres complètes, Gauthier-Villars, Paris, 1882.

Cauchy, A. L., Mémoire sur les premiers termes de la série des quantités qui sont propres à représenter
le nombre des valeurs distinctes d’une fonction des n variables indépendantes, Comptes Rendus
Paris, 21 (1845), 1093-1101.

Cauchy, A. L., Mémoire sur une nouvelle théorie des imaginaires, et sur les racines symboliques
des équations et des équivalences, Comptes Rendus Paris, 24 (1847), 1120-1130.

Dickson, L. E., New First Course in the Theory of Equations, Wiley, New York, 1939. 
Dickson, L. E., Linear Groups, Dover, New York, 1958. 

Edwards, H. M., Galois Theory, Springer, New York, 1984. 

Euclid, The Elements, Dover, New York, 1956. 

Fermat, P. de, Oeuvres de Fermat, ed. P. Tannery and C. Henry, Gauthier-Villars, Paris, 1922.
Gallian, J. A., Contemporary Abstract Algebra, 2nd ed., D. C. Heath, Lexington, MA, 1982.

Galois, E., Écrits et mémoires mathématiques, ed. R. Bourgne and J.-P. Azra, Gauthier-Villars, Paris,
1962.
Galois, E., Sur la théorie des nombres, Bulletin des Sciences mathématiques, 428 (1830).

Gauss, C. F., translated by Arthur A. Clarke: Disquisitiones Arithmeticae, Yale University Press,
1965.




Gillings, R. J., Mathematics in the Time of the Pharaohs, Dover, New York, 1972. 


Hadlock, C. R., Field Theory and Its Classical Problems, Mathematical Association of America,
Washington, DC, 1978.


Hahn, L., Complex Numbers and Geometry, Mathematical Association of America, Washington, 
DC, 1994. 

Hall, H. S., and Knight, S. R., Higher Algebra, Macmillan, London, 1919. 

Herstein, I. N., Topics in Algebra, 2nd ed., Wiley, New York, 1975. 

Hungerford, T. W., Algebra, Springer, New York, 1974. 

Jordan, C., Traité des substitutions et des équations algébriques, Gauthier-Villars, Paris, 1870.

Katz, V., A History of Mathematics: An Introduction, HarperCollins, New York, 1993.


Kiernan, B. M., The development of Galois theory from Lagrange to Artin, Arch. Hist. Exact Sci., 
8 (1971), 40-154. 
Kleiner, I., The evolution of group theory: a brief survey, Math. Mag., 59 (1986), 195-215. 


Kline, M., Mathematical Thought from Ancient to Modern Times, Oxford University Press, New 
York, 1990. 


Knopp, K., Problem Book in the Theory of Functions, Dover, New York, 1948. 
Lagrange, J. L., Oeuvres de Lagrange, Gauthier-Villars, Paris, 1867-1892. 


MacWilliams, F. J., and Sloane, N. J. A., The Theory of Error-Correcting Codes, North-Holland, 
Amsterdam, 1977. 


McCarthy, P. J., Algebraic Extensions of Fields, Dover, New York, 1991. 

Needham, J., Science and Civilization in China, Cambridge University Press, Cambridge, 1959. 
Serret, J. A., Cours d'algèbre supérieure, Gauthier-Villars, Paris, 1885.

Shanks, D., Solved and Unsolved Problems in Number Theory, Spartan Books, Washington, 1962. 
Stauduhar, R. P., The Determination of Galois Groups, Mathematics of Computation, 27 (1973), 
981-996. 

Stewart, I., Galois Theory, Chapman and Hall, London, 1989. 

Story, W. E., Note on the “15” Puzzle, Amer. J. Math., 2 (1879), 399-404.

van der Waerden, B. L., Modern Algebra, F. Ungar, New York, 1953. 

Wells, D., The Penguin Dictionary of Curious and Interesting Numbers, Penguin, Great Britain, 1986. 


Wussing, H., The Genesis of the Abstract Group Concept, MIT Press, Cambridge, 1984.


Solutions to Selected Exercises 


1.1.3 $(1 \pm \sqrt{7})/6$ 1.1.9 $(3abc - b^3)/a^3$ 1.1.11 $(b^2 - 4ac)/a^2$ 1.1.13 $-ab/c^2$
1.1.15 $x^2 + (p - q)x - pq = 0$ 1.1.17 $a \le 0$ or $a \ge 4$ i021 True 1.4.3 False




2.1.1 Argument: $\arctan(3/2) \approx 56.3^\circ$; modulus: $\sqrt{2^2 + 3^2} = \sqrt{13} \approx 3.6$. 2.1.3 Argument:
$180^\circ + \arctan(4/3) \approx 233.1^\circ$; modulus: $\sqrt{(-3)^2 + (-4)^2} = 5$. 2.1.5 7+2i 2.1.7 -34+4i
i 24; _3¥3 4 13; pale reer = i 

2.1.9 134131 2.1.11 5g t agi BeE-13 St Si 2S —>— Si 21-17 7 +24i 

2.1.19 i 2.1.21 —i 2.1.23 2=—-7—3i 2.1.25 w=z=i 

ij 1 ate oy, a eh apt 

a aa aio fa iP ari a lt 
Wyiji_v3,i v3 _i | VB_i ar 

2.2.3 { ah ei eS ay 2.2.5 +(2—i) 


2.2.7 {1.08 + .29i, -.79 + .79i, -.29 - 1.08i} 2.2.9 {,-8-5, 8-4} 2.2.11 +(¢ +i)


2.2.15 {-i,(-2+i)/5} 2.2.17 {-i,-3-2i} 2.2.23 -1 2.2.25 057 2.3.1 rational 


2.2.1 { z + 


2.3.3 degree 2 algebraic 2.3.5 rational 2.3.7 not algebraic 2.3.9 degree 2 algebraic 

2.3.11 not algebraic 2.4.7 102, 120, 128, 136, 160,170,192 2.4.13 constructible 

2.4.15 constructible 2.4.17 constructible 2.4.19 not known to be constructible 2.5.1 elements
$i$, $-1$, $-i$, and $1$ with orders 4, 2, 4, and 1, respectively. 2.5.3 elements $-\omega^2$, $\omega$, $-1$, $\omega^2$, $-\omega$,
and $1$ with orders 6, 3, 2, 3, 6, and 1, respectively. 2.5.5 elements $(1+i)/\sqrt{2}$, $i$, $(-1+i)/\sqrt{2}$,
$-1$, $(-1-i)/\sqrt{2}$, $-i$, $(1-i)/\sqrt{2}$, and $1$ with orders 8, 4, 8, 2, 8, 4, 8, and 1, respectively.

2.5.7 Let $\zeta = \cos(2\pi/10) + i\sin(2\pi/10)$. Then $\sqrt[10]{1}$ has elements $\zeta$, $\zeta^2$, $\zeta^3$, $\zeta^4$, $\zeta^5$, $\zeta^6$, $\zeta^7$, $\zeta^8$,
$\zeta^9$, and $1$, with orders 10, 5, 10, 5, 2, 5, 10, 5, 10, and 1, respectively. 2.5.9 —w* and w

2.5.11 $\zeta$, $\zeta^5$, $\zeta^7$, and $\zeta^{11}$ where $\zeta = \cos(2\pi/12) + i\sin(2\pi/12)$ 2.1 true 2.1.3 false 2.1.5 true


2.4.7 false 2.2.9 true 2.4.11 false 2.1.13 true 


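3.1.1 $x_1 = 3^{2/3} - 9/(3 \cdot 3^{2/3}) = 3^{2/3} - 3^{1/3}$, $x_2 = \omega 3^{2/3} - \omega^2 3^{1/3}$, $x_3 = \omega^2 3^{2/3} - \omega 3^{1/3}$
3.1.3 $x_1 = 6^{2/3} - 18/(3 \cdot 6^{2/3}) = 6^{2/3} - 6^{1/3}$, $x_2 = \omega 6^{2/3} - \omega^2 6^{1/3}$, $x_3 = \omega^2 6^{2/3} - \omega 6^{1/3}$
3.1.5 $x_1 = y_1 - a/3 = 2^{2/3} - 6/(3 \cdot 2^{2/3}) - 1 = 2^{2/3} - 2^{1/3} - 1$, $x_2 = \omega 2^{2/3} - \omega^2 2^{1/3} - 1$,
$x_3 = \omega^2 2^{2/3} - \omega 2^{1/3} - 1$
3.1.7 $x_1 = y_1 - a/3 = -2^{1/3} i\omega - (-6)/(3 \cdot (-2^{1/3} i\omega)) - 1 = -2^{1/3} i\omega + 2^{2/3} i\omega^2 - 1$,
$x_2 = -2^{1/3} i\omega^2 + 2^{2/3} i\omega - 1$, $x_3 = -2^{1/3} i + 2^{2/3} i - 1$
3.1.9 $x_1 = y_1 - 1/2 = 1/2 - (3/4)/(3 \cdot 1/2) - 1/2 = -1/2$,
$x_2 = \omega(1/2) - \omega^2 (3/4)/(3 \cdot 1/2) - 1/2 = (\omega - \omega^2 - 1)/2 = \omega$,
$x_3 = \omega^2 (1/2) - \omega (3/4)/(3 \cdot 1/2) - 1/2 = (\omega^2 - \omega - 1)/2 = \omega^2$ 3.1.11 $-1 - i$ 3.3.1 $x_1 = 0$,
$x_2 = -.6667$, $x_3 = -.5968$, $x_4 = x_5 = -.5958$ 3.3.3 $x_1 = 0$, $x_2 = -.85$, $x_3 = x_4 = -.8449$
3.3.5 $x_1 = 0$, $x_2 = .5667$, $x_3 = x_4 = .5658$ 3.3.7 $x_1 = -1$, $x_2 = -2.8305$, $x_3 = -2.0496$,
$x_4 = -1.9387$, $x_5 = x_6 = -1.9346$ 3.3.9 $x_1 = 2$, $x_2 = 2.1667$, $x_3 = 2.1545$, $x_4 = x_5 = 2.1544$
3.3.11 $x_1 = 3$, $x_2 = 4.5311$, $x_3 = 4.0488$, $x_4 = 3.7947$, $x_5 = 3.7311$, $x_6 = x_7 = 3.7276$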

3.4.1 False 3.1.3 False 3.4.5 True 




4.1.1 x = 1 4.1.3 x = 0, 1 4.1.5 x = 0 4.1.7 x = 4; x = 2, 3; x = 0, 2, 3; x = 4; x = 0
4.1.9 No solutions 4.1.11 y = 3, x = 6 4.1.13 y = 3, z = 2, x = 3 4.1.15 x = 1

4.1.17 x = 0, y = 3; y = 3, x = 11 4.1.19 x =7; no solution in Z,, 4.1.21 501 4.1.23 4
4.2.1 365 4.2.3 12 4.2.5 1 4.2.7 59 4.2.9 18 4.2.11 34 4.2.13 72 4.2.15 31

4.2.17 65,521 4.2.19 $1 = 1^{-1}$, $11 = 11^{-1}$, $5 = 5^{-1}$, $7 = 7^{-1}$ 4.2.21 x = -42, y = 24

4.3.1 Squares: 0, 1, 2, 4 4.3.3 Squares: 0, 1, 3, 4, 9, 10, 12 4.3.5 0, 1, 6 4.3.7 0, 1, 5, 8, 12
4.4.1 295° 4.4.3 $641 \cdot 6{,}700{,}417$ 4.4.5 270 4.4.15 325 4.n1 True 4.43 True 4.15 False


4.1.7 True 


5.1.1 $128x^7 + 1{,}344x^6y^2 + 6{,}048x^5y^4 + 15{,}120x^4y^6 + 22{,}680x^3y^8 + 20{,}412x^2y^{10} +
10{,}206xy^{12} + 2{,}187y^{14}$ 5.1.3 3x! 4+4y>z)5 5.1.5 2! 5.1.7 7,015,68007°b%c!?

5.1.9 489,888 5.2.1 0,1,2,4,4,2,1 5.2.3 x=1 5.2.5 x=1 5.2.7 x=7 5.2.15 The
primitive roots mod 11 are the powers $2^n$ where $n$ is relatively prime to 10, that is, 2, 8, 7, and
6. 5.2.17 For p = 2, the root is 1; for p = 3, the root is 2; for p = 5, the roots are 2 and 3; for
p = 7, the roots are 3 and 5; for p = 11, the roots are 2, 6, 7, and 8; for p = 13, the roots are 2,
6, 7, and 11; for p = 17, the roots are 3, 5, 6, 7, 10, 11, 12, and 14; for p = 19, the roots are
2, 3, 10, 13, 14, and 15. 5.2.27 a = 2 and n = 4 5.3.1 $x^2 + y^2 + z^2 + 2xy + 2xz + 2yz$
5.3.3 $x^4 + y^4 + z^4 + 4x^3y + 4xy^3 + 4x^3z + 4xz^3 + 4y^3z + 4yz^3 + 6x^2y^2 + 6x^2z^2 + 6y^2z^2 +
12x^2yz + 12xy^2z + 12xyz^2$
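
The primitive roots listed in 5.2.15 and 5.2.17 can be checked by brute force. The following is a minimal sketch, assuming p is prime; the function names are illustrative and do not come from the text.

```python
def multiplicative_order(a, p):
    """Smallest k >= 1 with a**k congruent to 1 (mod p); assumes gcd(a, p) == 1."""
    k, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        k += 1
    return k


def primitive_roots(p):
    """All a in {1, ..., p-1} whose multiplicative order mod the prime p is p - 1."""
    return [a for a in range(1, p) if multiplicative_order(a, p) == p - 1]


for p in (2, 3, 5, 7, 11, 13, 17, 19):
    print(p, primitive_roots(p))
```

The loop simply multiplies by a until it returns to 1, which is adequate for the small primes in these exercises.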

5-3-5 ot pt xP y? + 3x4 y? + 3x72 y5 + 3x y + 3x4 y? + 3xy7 43x72 y? + 6x3 y4 

5.3.7 887,400 5.3.9 2,355 5.4.1 $\varphi(24) = 8$; $\varphi(144) = 48$; and $\varphi(1{,}000) = 400$ 5.41 True
5.1.3 False 5.4.5 True 5.1.7 False 5.1.9 True 
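
The values of the Euler function in 5.4.1 can be confirmed with a short computation. This is a sketch using trial division and the product formula over the distinct prime divisors of n; the name `euler_phi` is chosen here for illustration.

```python
def euler_phi(n):
    """Count of integers in 1..n relatively prime to n, via the prime factors of n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p  # multiply result by (1 - 1/p)
        p += 1
    if m > 1:  # one prime factor larger than sqrt(n) may remain
        result -= result // m
    return result


print(euler_phi(24), euler_phi(144), euler_phi(1000))
```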




6.1.1 Quotient: x4 +x? +x? +x; remainder: x? +1 6.1.3 Quotient: x4 + 4x3 + x? + 4x +2; 
remainder: 2x7 +2x+4 6.1.11 x°+2x341 6.1.13 x9 +2x°+44x440342x74+4e41 
6.21 1x, xt], x7, x? +1l=(x 41), x? tx ax(x41), x72 4x41, x, 

w+ 1a(xtl(xPtx41), tx axe tl? txt], P4274, 4x72 =x72(x 41), 




eaxrtxtla(x+1), x4, xt +1 a(x +1), xf tx ax(x tlle? 4x4), xitedl, 
xigxrPaxt(ixt Py xtta2? tl a(x? txt 1? xt tx? tx ax(x?4+x74-1), 
xigxrtxtla(xti)(x3 +2741), x84 22 = x3(x 41), e447 41, 

xe teax(xet+x2 41), xt te tut la(xt1P(x?2 4x41), 

xf gtx an (xrtetl), xftxe tx? 41a(xt (x? tx41), 

xe xd te xP tx ax(xtl?, ett x2 4x27 4x41 6.2.3 x7, x7 41, x7 4+2=(x 4 1)(x +2), 
xPtxax(xtl),x?txetla=(x4+2P , x? +042, x? +20 = x(x +2), 

x? +2xt1la(xt1yP,x?+2x4+2 6.2.5 xt +1 =(x4+1)(x?+4x4-1), P 4x41, 
txt], 43x41 a(x + 4x43), x +4041 = (x + 2)(x?2 4+ 3x43 

6.2.7 p(p?—1)/3 6.2.13 3x—1 6.2.15 3x+1 6.2.17 2=8, b=0 6.3.1 x? +x41 
6.3.3 1 6.3.5 1 6.3.9 No 6.3.11 No 

6.3.27 (x2 +1) = x(x? + x4 ¢xF +1) +(xP txt (xo +x4tx41) 6.4.1 a?-26 

6.4.3 c—ab 6.4.5 —b/c 6.4.7 (a27+6)/(c—ab) 6.4.9 x? +23x—1 

6.4.11 x? — 3x27 426% —23 6.4.13 29 +23hx? +h 6.4.15 5 6.4.17 £75, tV7i, 4 
6.4.19 (54 V13)/2 6.4.27 —a,_,/a, 6.4.29 4, 5/4, 6.5.1 +i, +1 6.5.2 w, 07, 0,1 
6.1.1 False 6.1.3 True 6.1.5 False 6.1.7 True 6.1.9 True 




yi t, 7, P=ttl, oaedrtl arts, Pad t arte Par trsl, 
Pad(r+rtlHaPtrtrart leetrargl Parr tlaePtrate+el+r=al, 
71320 Pn wanrtl w=ptn 7 =pty, P=PprP=Pty +1, 

Pani tytn, = ty to =e tl, =p tae ttl, 1? =P +9? +0, 
MBanitptr’, tant tpanitPte tl, 
WPapPtnitptnanttptytnti, =p tnytpty ty =nity tnt, 
nap tyitytnantitntl, gap te tnantl Party Marty, 

7 =nity?, na=nity +1, q=pPtprtntl, wa=nit+ypt+ry+n, nani tytn, 
neanitytntl, on’ =p tnt, =anity ty, 1 =p +1, =n +7, 
pPl=pt+n=l. 

WL 8,8 SF +1, 2 = 2641, 8 H2, 8 a2, = 242, f =F +2, =1. 7.1.7 0,6, 
= 6? +2, 61 = 03 +20=67 42042, 6° = 2642, 6° = 267420, 6’ =67 +1, 88 =0* +042, 
0° = 267 +26+2, 6° =67 +2041, o'! =6+2, 6 =67 +28, 6 =2, 64 = 26, 64 = 267, 
66 = 267 +1, 0!” = 267 +2041, 08 =641, 0 =6? +0, 6 = 267 +2, 671 = 267 +2041, 
6 = 267 +642, 6% =20+1, 04 =267 +0, 0 =1. 

7.1.9 eC =a+4, & =S5a43, a =2046, OO =atl, o =2a44, ao =6a4], o =3, 

= 3a, a! =3a45, cl =at5, a? =6a4+4, a = 3043, 4 = 6045, ao = 4043, 
a! = 2, a!” = 2a, a! =2a41, a? =3a41, 0 = 4045, 2) =2042, a =4a41, 

0 = 5042, 4 =6, a = 6a, &§ =6at3, &? =2a4+3, &8% =S5atl, a =6a+6, 

@? = 5043, 2! =a46, a =4, o = 4a, 4 = 4042, & =6e42, © =a+3, 




7 = 4044, 8% =a42, ao =3044, ao =5, a4! =5a, a? =5at6, a? =4a46, 

4 = 3942, a? =S5at5, a =3a46, a” =2a04+5, 08% =1. 7.1.11 y=1,x=1+8 
7.1.13 x =f, y= +B+6,2=1 7.4.15 y=2,x=8-2 7.1.17 y=B+2, 2=6+2, 
x =0 7.1.19 See Theorem 6.2 7.2.1 Each element except 0 and 1 has order 7. 7.2.3 Each 
element except 0 and 1 has order 31. 7.2.5 uw, us Bl, Bl, wl’, ul, and uv? have order 24; 
w, w', ul4, and yw have order 12; u?, u?, u', and pu?! have order 8; u4 and p?° have order 6; 
u® and w'® have order 4; u® and yu! have order 3; uw!” has order 2; 1 has order 1. 7.2.13 0 
except that when p =2 and v=1 the answer is 1. 7.3.1 a and e’=1+a 7.3.3 2 and 3 
7.3.5 9; e,c=otla=ototlb=4l, co =e 40746, 8 =o +07? +1, and 
o4=04+1 7.3.7 6, 0° =204+2, 6 =20, and o’ =o4+1 7.3.9 a,@ =at+1, a’ =6a+1, 
a! = 45, a? = 3043, a!? = 2a, a! = 3041, 0? =Sa+2, a = 6a, &? =6a4+6, 
Ol =at6, o =6a+2, & =4a+4, a! =5e, a? =4a4+6, of” =2a4+5 
74d x2 +4x4+2, x7 43x43, x7 4442, x7 42043 7.4.3 x9 2x41, xP 42x? 441, 
4x? 42x41, x9 +2x?741 7.4.11 $x^4 + x + 1$ is the minimal polynomial for $\sigma$, $\sigma^2$, $\sigma^4$, and
$\sigma^8$; $x^4 + x^3 + x^2 + x + 1$ is the minimal polynomial for $\sigma^3$, $\sigma^6$, $\sigma^{12}$, and $\sigma^9$; $x^2 + x + 1$ is the
minimal polynomial of $\sigma^5$ and $\sigma^{10}$; $x^4 + x^3 + 1$ is the minimal polynomial for $\sigma^7$, $\sigma^{14}$, $\sigma^{13}$, and
$\sigma^{11}$; $x$ and $x + 1$ are the minimal polynomials of 0 and 1, respectively. 7.4.13 Using
Exercise 7.1.8, x? + 4x +2 is the minimal polynomial for w and y°; 

(x —w?)(x — pl?) = x? +3x +4; (x —W?)(x — ul) = x? +3; (x —u*)(x —u) = x? +4x $1; 
(x —u? (x — Hie: +3x 435 (x- Bee ul) =x? 4x41; (x —w)(x —p"!) = x7 +2; 
(x —ul)(x —u!”) = x? +x +2; (x —pl4\(x —p) = x? + 2e +4; 

(x — pl? )(x — yu?) = x? + 2x +3; x —a is the minimal polynomial for all a € Z,. 

7.4.1 True 7.1.3 False 7.1.5 False 7.4.7 True 




8.1.1 One 8.1.3 One 8.1.5 One 8.1.7 One 8.1.9 Two 8.1.11 Three 8.1.13 One 
8.1.15 Three 8.1.17 Six 8.1.19 Four 8.1.21 Six 8.1.23 Six 8.1.25 24 8.1.27 One 
8.1.29 One 8.1.31 20 8.1.33 15 8.1.35 120 8.1.37 x,x,°°°x, 8.1.39 Xp Xp X50 
8.1.41 (XX, +H, Xy) XoXo, 8.1.43 XPX_---e,_ yx, 8.2.1 (19865)(2347)
8.2.3 (19)(28)(37)(46)(5) 8.2.5 (1)(2), (12) 8.2.7 (1), (12), (13), (14),
(23), (24), (34), (123), (132), (124), (142), (134), (143), (234), (243),
(12)(34), (13)(24), (14)(23), (1234), (1243), (1324), (1342), (1423),
(1432) 8.2.9 (137)(2496)(5)(8); order is 12. 8.2.11 (17)(83)(254)(6)(9); order
is 6. 8.2.13 (152)(3479)(68); order is 12. 8.2.15 (12)(23)(45)(56)(67)(89)
8.2.17 (19)(98)(86)(65)(23)(34)(47), 8.3.1 Yes, x, 8.3.3 Yes, x,—x, 8.3.5 Yes,


A, = (x, — X))(x, ~ %3)(x, — x3). 8.3.7 Yes, x) x,x3x, 8.3.9 Yes, x,x,+x;x, 8.3.11 Yes, 


xy 


XXyX3X5%_ 8.3.13 No, by Corollary 8.11 8.3.15 Yes, x; 8.3.17 Yes, x, x,x3;%,x;%_ 8.3.19 No, 
by Theorem 8.10 with p =5 8.3.21 No, by Theorem 8.10 with p =7 8.3.23 Yes, x, x, 




8.3.25 Yes, (x, x, + x54) X5%6°°+x, 8.4.1 Odd 8.4.3 Even 8.4.5 (1)(2) 8.4.7 (1), (123), 
(132), (124), (142), (134), (143), (234), (243), (12)(34), (13)(24),
(14)(23) 8.4.9 (12),(13),(23) 8.4.23 Yes, x,A, 


n>5; x,d, ifn=4; A, ifn=3 8.21 False 8.1.3 True 8.1.5 True 8.1.7 False 8.1.9 False 


, 8.4.25 Yes, x, 5x,_)*,4,_, if 




9.1.1 Yes 9.1.3 No 9.1.5 No 9.1.7 Id and (12) 9.1.9 All of S, 9.1.11 Id and (23)
9.1.13 Id, (12)(34), (13)(24), and (14)(23) 9.1.15 Id, (12), (34), (12)(34),
(13)(24), (14)(23), (1324), and (1423) 9.1.19 After Id, the rotations are grouped by
the nature of their axes. Axis joining vertices: (5462), (56)(24), (5264), (1635),
(13)(56), (1536), (1234), (13)(24), (1432); axis joining midpoints of edges:
(12)(34)(56), (14)(23)(56), (16)(35)(24), (15)(36)(24), (13)(26)(45),
(13)(25)(46); axis joining centers of opposite faces: (126)(534), (162)(543),
(164)(235), (146)(253), (145)(263), (154)(236), (125)(463),
(152)(436). 9.1.21 The icosahedron has 10 pairs of opposite faces. The line joining the
centers of such a pair of faces acts as the axis of two nontrivial rotations of angles 120° and 240°. This
accounts for 20 symmetries. The icosahedron has 15 pairs of opposite edges. The line joining the
midpoints of such a pair of edges acts as the axis of a 180° rotation. This accounts for 15 symmetries.
The icosahedron has six pairs of opposite vertices. The line joining each such pair of vertices acts as
the axis of four nontrivial rotations of angles 72°, 144°, 216°, and 288°, respectively. This
accounts for 24 symmetries. Including the identity, we thus have 60 symmetries. 9.1.23 (134);
120° clockwise rotation about altitude from 2 9.1.25 (123); 120° clockwise rotation about
altitude from 4 9.1.27 (14)(23); 180° rotation about the line joining the centers of edges 14
and 23 9.1.29 (136)(475); 120° rotation about the diagonal joining 2 and 8

9.1.31 (17)(26)(35)(48); 180° rotation about the axis joining the midpoints of the edges
26 and 48 9.1.33 (14)(28)(35)(67); 180° rotation about the axis joining the midpoints of
the edges 14 and 67 9.1.35 $2^n$ 9.2.1 Yes 9.2.3 No 9.2.5 No 9.2.7 No 9.2.9 No

9.2.11 Yes 9.2.13 Yes 9.2.15 {0}, {1,3}, {2} 9.2.17 {1}, {i, -i}, {-1} 9.2.19 Each
element is its own inverse. 9.2.21 {0}, {1,4}, {2,3} 9.2.23 {1}, {5} 9.2.25 {0}, {1,5},
{2,4}, {3} 9.2.27 {1}, {a,e}, {b,f}, {c,g}, {d} 9.3.3 No, K and $(\mathbb{Z}_4, +)$ are not
isomorphic to each other. 9.3.11 No 9.3.21 Define f(z) = 27. 9.4.1 m = 1: {0}; m = 2:
{0}, $\mathbb{Z}_2$; m = 3: {0}, $\mathbb{Z}_3$; m = 4: {0}, {0,2}, $\mathbb{Z}_4$; m = 5: {0}, $\mathbb{Z}_5$; m = 6: {0}, {0,3},
{0,2,4}, $\mathbb{Z}_6$; m = 7: {0}, $\mathbb{Z}_7$; m = 8: {0}, {0,4}, {0,2,4,6}, $\mathbb{Z}_8$; m = 9: {0}, {0,3,6},
$\mathbb{Z}_9$; m = 10: {0}, {0,5}, {0,2,4,6,8}, $\mathbb{Z}_{10}$
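
The subgroup lists in 9.4.1 follow one pattern: the additive group of integers modulo m has exactly one subgroup for each divisor d of m, generated by m/d. A small sketch that enumerates them (the function name is an illustrative choice, not taken from the text):

```python
def subgroups_of_Zm(m):
    """Subgroups of (Z_m, +), one per divisor d of m, listed by increasing order d."""
    return [list(range(0, m, m // d)) for d in range(1, m + 1) if m % d == 0]


for m in range(1, 11):
    print(m, subgroups_of_Zm(m))
```

For m = 8, for example, this prints the trivial subgroup, {0, 4}, {0, 2, 4, 6}, and the whole group, in that order.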

9.4.3 {Id}, {Id,(12)}, {Id,(123),(132)}, {Id,(1234),(13)(24),(1432)}, S,,
D,, Ay, and S,, respectively 9.4.5 {Id,(1234),(12)(345),(12),(345),(354)}, D,, 
D,, and A,, respectively 9.4.7 {0}, {0,6*} for &e{0,1,...,6}; {0,6', 8,8’ + B/ } where i 
and j are distinct elements of {0,1,...,6}; (GF(2, x? +x?4+1),+) 9.4.9 {1}, {14}, 


438 SELECTED SOLUTIONS 


{1,a,d,e}, {1,6,d, f}, {1,c,d, g¢}, and the whole group 9.4.11 also {1,4,7, 10,13} and 
{2,5,8,11,14} 9.4.13 also {1,10}, {2,11}, {3,12}, {413}, {5,14}, {6,15}, {7,16}, 
and {8,17} 9.4.15 also {(13),(123),{(23),(132)}} 

9.4.17 {Id,(12)34),(13)(24),(14)(23)}, {(123),(134),(243),(142)}, 
{(132),(234),(124),(143)} 9.4.19 also {ae}, {6,f}, and {c,g} 9.4.21 H and 
G—-H 9.4.23 k= 1: x,%,x33 k= 2: (x, —xy)(%, — 95 )(X, — 5)3 A= 3: 3 R =O: x, xp x35 no 
such function exists for k= 4, k=5,or R>6=3!. 9.4.25 (a) R= 1: x, x)x3%4x55 k= 2: As; 
k =5: x,; no such function for k =3 or k =4. (b) k=10: x,x,; & = 12: let f be the function 
such that Dz, = Ss, ¢3 no such function for k= 11, = 13, or R= 14. 9.4.29 No 9.4.31 S, 
9.4.33 Ay 9.4.35 D, 9.4.37 {Id,(12345),(13524),(14253),(15432)} 

9.4.47 {1,4} 9.4.49 A, 9.4.51 $S_n$: if $n = 1$ or 2, then the center equals $S_n$. For $n > 2$, the
center of $S_n$ is {Id}. 9.5.3 (Zy+) 9.5.5 (Zy+) 9.5.7 K 9.5.9 (Zy+) 9.5.11 K

9.5.15 2 9.5.17 10 9.5.19 5 9.5.21 21 9.5.23 4 9.6.1 $P_0 = \mathrm{Id}$; $P_1 = (01)$

9.6.3 $P_0 = \mathrm{Id}$; $P_1 = (0123)$; $P_2 = (02)(13)$; $P_3 = (0321)$ 9.6.5 $P_0 = \mathrm{Id}$;
$P_1 = (01234)$; $P_2 = (02413)$; $P_3 = (03142)$; $P_4 = (04321)$ 9.6.7 $P_0 = \mathrm{Id}$;
$P_1 = (012345)$; $P_2 = (024)(135)$; $P_3 = (03)(14)(25)$; $P_4 = (042)(153)$;
$P_5 = (054321)$ 9.11 True 9.13 False 9.1.5 True 9.1.7 True 9.1.9 True 9.1.11 False
9.1.43 True 9.215 True 9.17 True 9.4.19 True 
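
The permutations in 9.6.1 through 9.6.7 have the form "add k modulo n" acting on {0, ..., n-1}. Assuming that reading, the sketch below computes the disjoint cycles of such a translation; the function name and the convention of omitting fixed points are choices made here for illustration.

```python
def translation_cycles(n, k):
    """Disjoint cycles of x -> (x + k) mod n on {0, ..., n-1}; fixed points are omitted."""
    seen, cycles = set(), []
    for start in range(n):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = (x + k) % n
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles


for k in range(6):
    print(k, translation_cycles(6, k))
```

An empty list corresponds to the identity permutation.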




10.1.1 (Z,,+) 10.1.3 (Z;,+) 10.1.5 H is not normal in G 10.1.7 (Z,+) 10.1.9 H is not
normal in G 10.1.11 (Z,,+) 10.1.13 $G/\langle (1234) \rangle \cong (\mathbb{Z}_2, +)$; $G/\{\mathrm{Id}, (13)(24)\} \cong K$.
10.1.15 $G/\langle x \rangle \cong K$ if $x \ne 0$; $G/\{0, x, y, x+y\} \cong (\mathbb{Z}_2, +)$ if x and y are distinct and nonzero.
10.1.17 The nontrivial subgroups of G are { 1,07, 04,0°} and {1,04}. The quotients are 
isomorphic to (Z,, +) and (Z,, +), respectively. 10.1.19 G/K =(Z,,+). 10.1.29 It follows 
from Exercise 8.2.23 that each conjugacy class consists of a maximal set of permutations whose
disjoint cycle decompositions all have the same number of $k$-cycles for each positive integer $k$.
10.3.1 $\{a + b\sqrt{5} \mid a, b \in \mathbb{Q}\}$ 10.3.3 $\{a + bi \mid a, b \in \mathbb{Q}\}$
10.3.5 $\{a + b\omega \mid a, b \in \mathbb{Q}\} = \{a + b\sqrt{-3} \mid a, b \in \mathbb{Q}\}$
10.3.7 $\{a + b\sqrt{2} + ci + di\sqrt{2} \mid a, b, c, d \in \mathbb{Q}\}$ 10.3.9 $GF(2, x^2 + x + 1)$
10.3.11 $GF(2, x^4 + x^3 + x^2 + x + 1)$ 10.3.13 This is $GF(2, x^2 + x + 1)$, which has order 4.
10.3.15 This is $GF(2, x^4 + x^3 + x^2 + x + 1)$, which has order 16. 10.3.19 rr + a [*]

1 


10.3.20 ell 10..1 False 10.1.3 False 10.4.5 True 




11.2.9 Groups (a), (d), (e), and (g) are all cyclic of order 8. Group (f) has every element but the 
identity of order 2, so it is isomorphic to group (i). Group (h) is commutative with an element of 


order 4 but none of order 8. Groups (b) and (c) are noncommutative and also not isomorphic to 


SELECTED SOLUTIONS 439 


each other. 11.2.11 Groups (a), (c), (d), and (e) are all cyclic and so isomorphic to each other. 
Groups (i) and (d) are isomorphic to each other. 11.1.1 False 11.4.3 False 11.2.5 True 


11.1.7 True 


12.1.1 (a) If neither is divisible by 3 then $x^2 \equiv 1 \equiv y^2 \pmod 3$. It follows that
$z^2 = x^2 + y^2 \equiv 1 + 1 = 2$. This, however, is impossible. (b) Suppose none of x, y, z is divisible by
5. Then $x^2$, $y^2$, $z^2$ are each in the set $\{\pm 1\}$ (mod 5). Consequently $x^2 + y^2 \in \{\pm 2, 0\}$, which is
disjoint from $\{\pm 1\}$. Hence $x^2 + y^2 \ne z^2$. 12.1.3 Primitive: (319, 360, 481), (31, 480, 481),
(600, 481, 769). Non-primitive: (156, 455, 481), (481, 3108, 3145), (481, 8892, 8905).

12.1.5 Primitive: (5, 12, 13). Non-primitive: (6, 8, 10) 12.2.1 $a^{1/2}$ is interpreted as $\sqrt{a}$. If
p = 2 then a is a quadratic residue modulo 2 if and only if $a \equiv 1$ iff $\sqrt{a} \equiv 1$, which is obvious.
12.2.3 Note that $5525 = 5 \cdot 5 \cdot 13 \cdot 17$ and each of the primes 5, 13, 17 is the sum of two
squares. Several applications of Brahmagupta’s proposition then yield $5525 = 14^2 + 73^2$.
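
The computation in 12.2.3 can be traced step by step. The sketch below assumes Brahmagupta’s proposition in the form $(a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2 = (ac + bd)^2 + (ad - bc)^2$; the function name and the `alt` flag are illustrative. Different sign choices give different representations of 5525, and the choices below are the ones that lead to $14^2 + 73^2$.

```python
def compose(s, t, alt=False):
    """Combine a^2 + b^2 and c^2 + d^2 into a sum of two squares equal to their product."""
    (a, b), (c, d) = s, t
    if alt:
        return (a * c + b * d, abs(a * d - b * c))
    return (abs(a * c - b * d), a * d + b * c)


five = (1, 2)      # 5  = 1^2 + 2^2
thirteen = (2, 3)  # 13 = 2^2 + 3^2
seventeen = (1, 4) # 17 = 1^2 + 4^2

step1 = compose(five, five)                 # 25  = 3^2 + 4^2
step2 = compose(step1, thirteen, alt=True)  # 325 = 18^2 + 1^2
step3 = compose(step2, seventeen)           # 5525 = 14^2 + 73^2
print(step1, step2, step3)
```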

12.2.5 Suppose that p = 1 (mod 4). Then (—1)(?-/) = 1, so that (—1)(?-9/) is a solution of 
x? 4+1=1 (mod 4). 12.2.7 The only prime divisor of 2 of form 4 +3 is 3 and it has an even 
exponent of 2. Hence, by Proposition 12.25, the required triangle exists. 12.3.1 Let p be an odd
prime and a not be a multiple of p. Then, modulo p, the following are equivalent for any
number x: $ax^2 + bx + c \equiv 0$ and $(2ax + b)^2 - (b^2 - 4ac) \equiv 0$. Consequently the first equation
has 0, 1, or 2 distinct solutions according as $b^2 - 4ac$ is a quadratic nonresidue, 0, or a quadratic
residue. 12.3.3 1, 4, 9, 16, 25, 5, 18, 20, 14, 10, 8 12.3.5 Pair each quadratic residue a
different from 1 and -1 with its multiplicative inverse $a^*$ where $aa^* \equiv 1 \pmod p$. Since the
product of each pair is equivalent to 1 (mod p) it follows that the product of the quadratic
residues of p is congruent (mod p) to -1 if -1 is a quadratic residue of p and to 1
otherwise. 12.3.7 First suppose that $p = 4k + 1$. By the Law of Quadratic Reciprocity
$(7/p)(p/7) = 1$. However, $(p/7) = (r/7)$ where r is the remainder when p is divided by 7 and it is
easy to check that $(r/7) = 1$ for r = 1, 2, 4. Thus, p must be of the form $28k + 1$, $28k + 9$ or
$28k + 25$. 12.3.9 (a) 1, (b) 1, (c) -1, (d) -1 12.3.11 Let S denote the given sum. Then
$6S = ((p - 1)/2)((p + 1)/2)p \equiv 0 \pmod p$ and hence $p \mid 6S$. Since p is neither 2 nor 3, we
have that $p \mid S$. 12.4.1 13, 21, 21, 21, 25 12.4.5 x = 0, y = 1 12.4.7 $(2+i)(3-i)(5-2i)$
12.4.9 (2+i)(3+i)(7—2i? 12.4.11 $8 - 2i = (2 - i)(3 + i) + 1 - i$ 12.4.13 (a) $3 + i$ (c) $3 + 4i$
12.5.1 11, 11, 15, 15, 17 12.5.7 $(1 + \sqrt{-2})(2 + 5\sqrt{-2})$

12.5.9 $(1 + \sqrt{-2})(2 + 5\sqrt{-2})(5 - \sqrt{-2})$ 12.5.11 $r = 1 - 2\sqrt{-2}$, $q = 3 + \sqrt{-2}$ 12.5.13 (a)
$3 + \sqrt{-2}$, (c) $3 + 4\sqrt{-2}$




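13.2.1 $0$, $1$, $2$, $3$, $\sqrt{-5}$, $1 + \sqrt{-5}$, $2 + \sqrt{-5}$, $3 + \sqrt{-5}$, $4 + \sqrt{-5}$, $1 + 2\sqrt{-5}$
13.2.3 Irreducibles of norm < 30: $0$, $1$, $2$, $3$, $\sqrt{-5}$, $1 + \sqrt{-5}$, $2 + \sqrt{-5}$, $3 + \sqrt{-5}$, $4 + \sqrt{-5}$,
$1 + 2\sqrt{-5}$, $3 + 2\sqrt{-5}$. Others: $4 = 2 \cdot 2$; $5 = \sqrt{-5} \cdot -\sqrt{-5}$; $2\sqrt{-5} = 2 \cdot \sqrt{-5}$;
$2 + 2\sqrt{-5} = 2 \cdot (1 + \sqrt{-5})$.


$p$     $-10 \equiv$ (mod $p$)    $\mathfrak{p}_p$             $(p)$                                      $N(\mathfrak{p}_p)$
2       0                  $(\sqrt{-10}, 2)$         $\mathfrak{p}_2^2$                         2
3       2                  $(3)$                     $(3)$                                      9
5       0                  $(\sqrt{-10}, 5)$         $\mathfrak{p}_5^2$                         5
7       4                  $(2 + \sqrt{-10}, 7)$     $\mathfrak{p}_7 \bar{\mathfrak{p}}_7$      7
11      1                  $(1 + \sqrt{-10}, 11)$    $\mathfrak{p}_{11} \bar{\mathfrak{p}}_{11}$  11
13      3                  $(4 + \sqrt{-10}, 13)$    $\mathfrak{p}_{13} \bar{\mathfrak{p}}_{13}$  13
17      7                  $(17)$                    $(17)$                                     289
19      9                  $(3 + \sqrt{-10}, 19)$    $\mathfrak{p}_{19} \bar{\mathfrak{p}}_{19}$  19
23      13                 $(6 + \sqrt{-10}, 23)$    $\mathfrak{p}_{23} \bar{\mathfrak{p}}_{23}$  23

Table S.1 Solution to Exercise 13.7.1


13.3.1 (a) True, (b) True, (c) False, (d) True, (e) False, (f) True, (g)
True 13.3.3 (a) Principal: $(9)$ (b) Principal: $(2 - \sqrt{-6})$ (c) Principal: $(2)$ (d) Principal: $(3)$ (e)
Not principal (f) Principal: $(1)$ (g) Principal: $(1)$ (h) Principal: $(1)$ (i) Principal: $(1)$ (j) Principal:
$(1)$ 13.3.5 Suppose $N(a + b\sqrt{-6}) = 2$. Then $a^2 + 6b^2 = 2$, so $b^2 \le \tfrac{1}{3}$ and $b = 0$. But $a^2 = 2$ has no solution
for a an integer. 13.3.7 $\mathfrak{p}\bar{\mathfrak{p}} = (5, 2 + \sqrt{-6}) \cdot (5, 2 - \sqrt{-6}) = (25, 10 - 5\sqrt{-6}, 10 + 5\sqrt{-6}, 10)$.
5 divides each generator of $\mathfrak{p}\bar{\mathfrak{p}}$, so $\mathfrak{p}\bar{\mathfrak{p}} \subseteq (5)$. $25 - (2) \cdot 10 = 5$, so $5 \in \mathfrak{p}\bar{\mathfrak{p}}$ and $(5) \subseteq \mathfrak{p}\bar{\mathfrak{p}}$.

$\mathfrak{q}\bar{\mathfrak{q}} = (4, 4 - 2\sqrt{-6}, 4 + 2\sqrt{-6}, 10)$. 2 divides each generator: $\mathfrak{q}\bar{\mathfrak{q}} \subseteq (2)$. $2 = 10 - (2)4 \in \mathfrak{q}\bar{\mathfrak{q}}$, so
$(2) \subseteq \mathfrak{q}\bar{\mathfrak{q}}$. $\mathfrak{p}\mathfrak{q} = (10, 10 + 5\sqrt{-6}, 4 + 2\sqrt{-6}, (2 + \sqrt{-6})^2)$. Since $10 = (2 + \sqrt{-6})(2 - \sqrt{-6})$,
each generator is divisible by $2 + \sqrt{-6}$. $(10 + 5\sqrt{-6}) - 2(4 + 2\sqrt{-6}) = 2 + \sqrt{-6} \in \mathfrak{p}\mathfrak{q}$.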

13.4.1 (a) 36 or —36, (b) 36 or —36, (c) 1 or —1, (d) 1 or —1 

13.4.3 Suppose it is principal; that is, there exists $\alpha$ such that $\alpha \cdot \beta = 3$ and $\alpha \cdot \gamma = 1 + \sqrt{-5}$.
Then $N(\alpha) \cdot N(\beta) = 9$ and $N(\alpha) \cdot N(\gamma) = 6$. However, no element of $\mathbb{Z}[\sqrt{-5}]$ has norm 3. Thus
$N(\alpha) = 1$.

Consider $3(a + b\sqrt{-5}) + (1 + \sqrt{-5})(c + d\sqrt{-5}) = 1$. This gives two equations of rational
integers: $3a + c - 5d = 1$ and $3b + c + d = 0$. Then $c - 5d \equiv 1 \pmod 3$ and $c + d \equiv 0 \pmod 3$.
Subtracting the equations gives $-6d \equiv 1 \pmod 3$, which is impossible. Hence $1 \notin (3, 1 + \sqrt{-5})$. 13.5.1 (a) 36, (b) 36, (c) 1,
(d) 1 13.6.3 $(5)$ is not prime. $(5, 2 + \sqrt{-6})(5, 2 - \sqrt{-6}) = (5)$. 13.6.5 $(6)$ is not prime.
$(2) \cdot (3) = (6)$. 13.7.1 See Table S.1 13.7.3 See Table S.2 13.7.5 Let $\alpha$ and $\beta$ be rational
primes. Suppose $\alpha, \beta \in I$ and $\alpha, \beta$ are not associates. Then $(\alpha, \beta) = (1)$ since $\alpha, \beta$ are relatively
prime rational integers. $\alpha, \beta \in I$ means $I \mid (\alpha, \beta)$, so $I = (1)$.


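$p$     $-14 \equiv$ (mod $p$)    $\mathfrak{p}_p$             $(p)$                                      $N(\mathfrak{p}_p)$
2       0                  $(\sqrt{-14}, 2)$         $\mathfrak{p}_2^2$                         2
3       1                  $(1 + \sqrt{-14}, 3)$     $\mathfrak{p}_3 \bar{\mathfrak{p}}_3$      3
5       1                  $(1 + \sqrt{-14}, 5)$     $\mathfrak{p}_5 \bar{\mathfrak{p}}_5$      5
7       0                  $(\sqrt{-14}, 7)$         $\mathfrak{p}_7^2$                         7
11      8                  $(11)$                    $(11)$                                     121
13      12                 $(5 + \sqrt{-14}, 13)$    $\mathfrak{p}_{13} \bar{\mathfrak{p}}_{13}$  13
17      3                  $(17)$                    $(17)$                                     289
19      5                  $(9 + \sqrt{-14}, 19)$    $\mathfrak{p}_{19} \bar{\mathfrak{p}}_{19}$  19
23      9                  $(3 + \sqrt{-14}, 23)$    $\mathfrak{p}_{23} \bar{\mathfrak{p}}_{23}$  23

Table S.2 Solution to Exercise 13.7.3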


13.7.6 (5,-2+V¥—1)+(5,— ue A 5,5-(—2— ¥=1),5-(-2 + V=1), (-2+ 
V¥=1\(-2- V=1)) = (5-5,5-(— ia ae 




14.1.1 Let $x = a \cdot b$ be rational integers. x is prime if and only if $x \mid a$ or $x \mid b$. Clearly $a \mid x$, so
$x \mid a$ if and only if $(x) = (a)$ if and only if b is a unit. a or b is a unit if and only if x is irreducible.
14.1.3 Suppose n is divisible by $a^2$ for some rational integers n and a, nonunits. Consider $n/a$,
which is a rational integer since $a \mid n$. $n \nmid n/a$ and $(n/a)^2 = n^2/a^2 = n \cdot n/a^2$. Since $a^2 \mid n$, this last fraction is
still a rational integer, which shows that $n \mid (n/a)^2$. Conversely, suppose $n \nmid m$ and $n \mid m^2$. Let $m = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$
be the prime factorization of m; $m^2 = p_1^{2e_1} p_2^{2e_2} \cdots p_k^{2e_k}$. Since $n \mid m^2$, n has no prime factors
which do not appear in the list $\{p_1, \ldots, p_k\}$. Since $n \nmid m$, at least one of these factors must appear
with power greater than 1. Thus, there is a square which divides n. 14.1.5 By part (a) of
Proposition 14.2, we have $(-a) \cdot b = ((-1) \cdot a) \cdot b$. Associativity gives $= (-1) \cdot (a \cdot b) = -(a \cdot b)$, which
is the third term. Commutativity gives $= (a \cdot (-1)) \cdot b = a \cdot ((-1) \cdot b) = a \cdot (-b)$. 14.1.7 Suppose
b and c are both a multiplicative inverse of a. That is, $a \cdot b = a \cdot c = 1$. $a \cdot b = 1$ if
and only if $c \cdot (a \cdot b) = c \cdot 1 = c$ if and only if $(c \cdot a) \cdot b = c$ if and only if $1 \cdot b = c$ if and only
if $b = c$. 14.1.9 Ring. Units are 1 and 2 since $2 \cdot 2 = 1$. 14.1.11 Ring. Units are 1
and 5. $2 \cdot 3 = 4 \cdot 3 = 0$; $5 \cdot 5 = 1$. 14.1.13 Not a ring; subtraction is not commutative.
$3 - 1 \ne 1 - 3$. 14.1.15 Not a ring; cross product is not commutative and has no identity.
14.1.17 Not a ring; no additive (union-wise) inverse. 14.1.19 Ring. Units are $f: \mathbb{R} \to \mathbb{R}$


such that $f(x) \ne 0$ for all $x \in \mathbb{R}$. 14.2.1 (a) $(3, 1 + 2\sqrt{-5})(3, 1 - 2\sqrt{-5}) = (9, 3 + 6\sqrt{-5}, 3 -
6\sqrt{-5}, 21) = (3)$. (b) $(7, 1 + 2\sqrt{-5})(7, 1 - 2\sqrt{-5}) = (49, 7 - 14\sqrt{-5}, 7 + 14\sqrt{-5}, 21) = (7)$.
(c) $(3, 1 + 2\sqrt{-5})(7, 1 + 2\sqrt{-5}) = (21, 7 + 14\sqrt{-5}, 3 + 6\sqrt{-5}, (1 + 2\sqrt{-5})^2)$. Since $21 = (1 +
2\sqrt{-5})(1 - 2\sqrt{-5})$, $1 + 2\sqrt{-5}$ divides all generators. $(7 + 14\sqrt{-5}) - 2(3 + 6\sqrt{-5}) = 1 + 2\sqrt{-5}$
implies $(1 + 2\sqrt{-5}) \in (21, 7 + 14\sqrt{-5}, 3 + 6\sqrt{-5}, (1 + 2\sqrt{-5})^2)$. Thus, $(21, 7 + 14\sqrt{-5}, 3 +
6\sqrt{-5}, (1 + 2\sqrt{-5})^2) = (1 + 2\sqrt{-5})$. (d) $(3, 1 - 2\sqrt{-5})(7, 1 - 2\sqrt{-5}) = (21, 7 - 14\sqrt{-5}, 3 -
6\sqrt{-5}, (1 - 2\sqrt{-5})^2)$. Since $21 = (1 + 2\sqrt{-5})(1 - 2\sqrt{-5})$, $1 - 2\sqrt{-5}$ divides all generators;
$(1 - 2\sqrt{-5}) \mid (21, 7 - 14\sqrt{-5}, 3 - 6\sqrt{-5}, (1 - 2\sqrt{-5})^2)$. Since $7 - 14\sqrt{-5} - 2(3 - 6\sqrt{-5}) =
1 - 2\sqrt{-5}$, $(21, 7 - 14\sqrt{-5}, 3 - 6\sqrt{-5}, (1 - 2\sqrt{-5})^2) \mid (1 - 2\sqrt{-5})$. 14.2.3 (a) $(6)(3) = (18)$.
(b) $6 \notin (18)$, so $6 \notin (18)X$ for any ideal $X \subseteq \mathbb{Z}$. 14.2.5 Suppose I divides every ideal of R.
In particular, $I \mid (2)$, so that $2 \in I$. Also $I \mid (3)$, so $3 \in I$. But $2, 3 \in I$ implies $3 - 2 = 1 \in I$, so
$I = (1) = R$. 14.3.1 Suppose R is a finite integral domain. Let x be a nonzero element of R
and consider $xa = xb$. Then $xa - xb = x(a - b) = 0$. Since R is an integral domain $a - b = 0$.
Thus, for any distinct a, b, $ax \ne bx$. $|(x)|$ is equal to the number of $r \in R$ such that $ax = r$
for some $a \in R$, which is the number of distinct elements a of R. Thus, $(x) = R$ for any nonzero $x \in R$,
so every nonzero x is a unit and R is a field. 14.4.3 One residue system is $\{0, 1, -1, i, -i, 1+i, 1-i, -1+i, -1-i, 2-i\}$;
see Tables S.3 and S.4. The ring is isomorphic to $\mathbb{Z}_{10}$. It is not a field. 14.4.5 (a) Let
$I = \{f \in F \mid f(0) = 0\}$. If $f, g \in I$, then $(f + g)(0) = f(0) + g(0) = 0 + 0 = 0$ so $f + g \in I$. If
$h \in F$, $f \in I$, then $(h \cdot f)(0) = h(0) \cdot f(0) = h(0) \cdot 0 = 0$ so $h \cdot f \in I$. Thus, I is an ideal.
Suppose $0 \ne h \in F/I$. $h(0) = a \ne 0$. Then $f(x) = h(x) - a \in I$. $h(x) \equiv h(x) - f(x)$ mod I, but
$h(x) - f(x) = a$ for all $x \in [0, 1]$, which is an invertible function. Thus, h is a unit and $F/I$ is a
field. I is a maximal ideal. (b) $\{f \in F \mid f(1/4) = 0\}$, $\{f \in F \mid f(1/2) = 0\}$, $\{f \in F \mid f(3/4) = 0\}$,
$\{f \in F \mid f(1) = 0\}$. (c) $I_1 = \{f \in F \mid f(0) = f(1/2) = f(1) = 0\} \subset I_2 = \{f \in F \mid f(0) = f(1) =
0\} \subset I_3 = \{f \in F \mid f(0) = 0\}$.


i       1+i     -1      0       1       -1-i    -i      1-i     2-i     -1+i
1+i     -1      0       1       -1-i    -i      1-i     2-i     -1+i    i
-1      0       1       -1-i    -i      1-i     2-i     -1+i    i       1+i

Table S.3 Addition table for Exercise 14.4.3


i       i       1-i     1       1+i     2-i     -1-i    -1      -1+i    i
1+i     1+i     -1+i    1-i     -1-i    0       1+i     -1+i    1-i     -1-i
-1      -1      1+i     i       -1+i    2-i     1-i     -i      -1-i    1

Table S.4 Multiplication table for Exercise 14.4.3


Index


15-puzzle, 171
Abel, Niels, 6, 51
abelian group, 195
abstract group, 193
al-Khwarizmi, 2
Algebra, 5
algebraic expression, 23
algebraic solution, 23
algebraically resolvable, 24
alternating group, 186
Archimedes, 1
argument, 11
argument principle, 14
Arithmetica, 278
Ars Magna, 5
ascending chain of ideals, 364
associates, 295, 320
automorphism, 205
Binet’s Formula, 90
binomial coefficient, 76
Binomial Theorem, 77, 101
Bombelli, Rafael, 5
Brahmagupta, 278
cancelable ideal, 337
Cardano, Gerolamo, 1
Carmichael numbers, 89
Cartesian number, 38
Cartesian representation, 10
Cauchy, Augustin-Louis, 6
center, 214, 266
centralizer, 214, 265
Chinese Remainder Theorem, 69
class equation, 267
codomain, 422
common divisor, 62
commutative group, 195
complement, 421
complete residue system, 370
complex integers, 356


complex number, 9 
composite, 37 
composite ideal, 359 
composition of functions, 422 
congruent modulo x, 57 
conjugacy class, 234, 265 
conjugate, 14, 234, 265, 318 
conjugate ideal, 331 
conjunction, 413 
constant polynomial, 103 
constructible, 26 
contrapositive, 416 
converse, 416 
coset, 207 
cubic equation, 4 
cycle, 161 

decomposition, 162 
cyclic group, 215 
cyclic permutation, 161 
cyclic table, 135 
cyclotomic equation, 50 
de Laplace, Pierre-Simon, 6 
decomposable group, 264 
decomposition 

into disjoint cycles, 162 
Dedekind cuts, 314 
degree, 103 
del Ferro, Scipione, 1 
derivative, 154 
dihedral group, 188 
Diophantus, 278 
direct product, 261 
discriminant, 169 

of cubic, 49 


disjoint cycle decomposition, 162 


disjunction, 413 


Disquisitiones Arithmeticae, 26, 50 


distinct variants, 156 
divisible ideal, 359 


division of polynomials, 103 
divisor, 37 
divisors, 108 
domain, 422 
doubling of a cube, 34 
elementary symmetric polynomials, 121 
empty set, 421
equal ideals, 358 
equation
    cubic, 4
    first-degree, 2
    quadratic, 3
equivalence relation, 425 
equivalent propositions, 413 
Erlanger Programm, 187 
Euclidean algorithm, 63 
Euclidean domain, 362 
Euclidean function, 362 
Euclidean greatest common divisor, 115 
Euler φ-function, 93
Euler’s Criterion, 281 
Euler, Leonhard, 6, 33 
Eulerian integers, 304 
even permutation, 170 
existential quantifier, 418 
extension, 246 
factorable, 108 
factorization, 108 
factors, 108 
Fermat's Theorem, 86 
Ferrari, Lodovico, 49 
Fibonacci numbers, 84 
field, 99
    extension, 246
    isomorphism, 243
field isomorphism, 243 
fifth-degree equation, 49 
first-degree equation, 2 
fourth-degree equation, 49 
Fundamental Theorem of Algebra, 51 
Fundamental Theorem of Symmetric Polynomials, 122
Galois field, 135 
Galois imaginaries, 132 
Galois polynomial, 142 
Galois, Evariste, 50, 51 
Gauss, Carl Friedrich, 6, 26 
Gaussian integers, 294 
Gaussian primes, 294 




generator
    of a group, 215


greatest common divisor, 62, 113, 340
    Euclidean, 115
ground field, 102 
group, 193
    abelian, 195
    alternating, 186
    center, 266
    center of, 214
    class equation, 267
    commutative, 195
    cyclic, 215
    decomposable, 264
    dihedral, 188
    direct product, 261
    generator of, 215
    indecomposable, 264
    isomorphism, 202
    Klein 4-, 187
    order, 202
    Quaternion, 195
    quotient, 230
    symmetric, 184
group isomorphism, 202 
group of permutations, 184 
highest common factor, 62 
Hilbert, David, 355 
Hisab al-jabr w'al-muqabalah, 3
homomorphism, 234 
ideal, 358
    cancelable, 337
    composite, 359
    divisible, 359
    irreducible, 359
    maximal, 368
    principal, 358
    unit, 358
ideal multiplication, 335 
ideals, 322 
identity permutation, 158 
imaginary number, 10 
indecomposable integers, 313 
index, 210 
injective, 423 
integers, 356
    complex, 356
    irrational, 356
    rational, 356




integral domain, 361 
intersection, 421 
invariant function, 156 
irrational, 294 
irrational integers, 356 
irreducible, 108, 320, 356 
irreducible ideal, 359 
isomorphic, 370
    fields, 242
isomorphic groups, 202 
isomorphism, 202, 243 
Jordan, Camille, 207 
kernel, 237 
Khayyam, Omar, 1 
Klein 4-group, 187 
Klein, Felix, 187 
Kummer, Ernst, 310 
Lagrange’s method, 127 
Lagrange’s Theorem, 207 
Latin square, 196 
lattice point, 292 
Law of Homomorphisms, 367 
Law of Quadratic Reciprocity, 70, 292 
left inverse, 423 
Legendre symbol, 285 
Lindemann, Ferdinand, 34 
maximal ideal, 368 
method of false position, 2 
method of generating functions, 81 
method of infinite descent, 283 
minimal polynomial, 152 
modular arithmetic, 57 
modular order, 87 
modular roots of unity, 87 
modulus, 11 
monic polynomial, 103 
monomorphism, 235 
Multinomial Theorem, 91 
multiple, 37 
multiplicatively perfect, 73 
multiplicity of a zero, 110 
negation, 413 
Newton-Raphson method, 51 
norm, 295, 318 
normal subgroup, 228 
number
    complex, 9
    imaginary, 10
odd permutation, 170 


On the Theory of Complex Numbers, 310 
On the Theory of Numbers, 131 
order, 136
    modular, 87
    of a finite group, 202
    of a root of unity, 36
parity of a permutation, 171 
Pascal’s Identity, 77 
Pascal’s Triangle, 77 
perfect number, 72 
permutation, 158, 425
    cyclic, 161
    even, 170
    odd, 170
    parity of, 171
polar form, 12 
polynomial
    irreducible, 108
    minimal, 152
    monic, 103
    variants, 155
polynomial over a field, 101 
prime, 37, 357 
prime factorization, 365 
prime ideal, 343 
prime subfield, 370 
primitive, 135, 150, 274 
Primitive Element Theorem, 144 
primitive root, 88 
principal ideal, 323 
principal ideal domain, 363 
principal ideals, 358 
product, 326 
propositional calculus, 413 
Pythagorean triple, 274 
quadratic domains, 356 
quadratic equation, 3 
quadratic field, 318 
quadratic nonresidue, 281 
quadratic residue, 281 
quartic equation, 49 
Quaternion group, 195 
quintic equation, 49 
quotient group, 230 
range, 422 
rational integers, 294, 356 
rational primes, 294 
reducible, 108 
relation, 425 


relatively prime, 63, 116 
Rhind Mathematical Papyrus, 1 
right inverse, 424 
ring, 355 
ring isomorphism, 370 
roots of unity, 18 
RSA encryption, 95 
Ruffini, Paolo, 157 
signature, 236 
solvable by radicals, 24 
subfield, 246 
subgroup, 206
    generator of, 215
    index of, 210
    normal, 228
    proper, 207
    trivial, 207
subset, 421 
superset, 421 
surjective, 424 
symmetric group, 184 
Tartaglia, Niccolò, x
Taylor, Richard, 278 
trace, 318 




Traité des Substitutions, 207 
transcendental numbers, 34 
transposition, 163 
trisection of an angle, 34 
trivial subgroup, 207 

truth tables, 413 

union, 421 

unique factorization property, 365 
unit, 320 

unit ideal, 358 

unit segment, 26 

units, 294 

universal quantifier, 418 
universe, 421 

variables, 101 

variants, 155 

vertex symmetries, 187 
Wantzel, Pierre, 35 

Wiles, Andrew, 278 
Wilson’s Theorem, 280 
zero divisor, 361 

zero of a polynomial, 107 
zero polynomial, 103 




A_n, 186

arg z, 11

(4), 76 

Z(G), 214 

Z 214 

C, 100 

C(a), 234 

z̄, 14

a, 331 

aH, 207 
[M(x)], 244 
(abcd), 161 
D_n, 188

G x H, 261 

A,, 169 

m|n, 37 

a ≡ b (mod n), 57
Xx+, 283 

GF(p, P(x)), 135 


(m,n), 62 




Notation 


i, 9 

a, 323 

1g, 193 

Id, 158 
[G:H], 210
Z, 23 

Zi,» 195 

Z_n, 58
Z[√−5], 72
a*, 193 

G ≅ H, 202
Ker f, 237 
K,192 
ass 
|z|, 11 
N(z), 72 
N(α), 318
N(a + bi), 295 
loll, 165 
N(a), 341 


Vz, 17 
o(ζ), 36, 136
o({2), 203 
o(7), 87 


(aay) 28 


φ(n), 93
F[x,Sn], 195 
F [x], 102 
G/H, 230 
Q, 100 

Q", 294 

R, 100 

(a), 185 
S-T, 225 
S|, 225 
Dory ry py 21 
A(A,B), 358 
S_n, 184
Tr(a), 318 

T, 236 


